Recipe 9.6. Globbing, or Getting a List of Filenames Matching a Pattern (Perl Cookbook)

9.6. Globbing, or Getting a List of Filenames Matching a Pattern

Problem

You want a list of filenames that match a wildcard pattern, such as MS-DOS's *.* or Unix's *.h. This is called globbing.

Solution

Perl provides globbing with the semantics of the Unix C shell through the glob keyword and < >:

@list = <*.c>;
@list = glob("*.c");
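As a brief illustration of those C shell semantics, glob also accepts brace alternation, so one pattern can cover several extensions. This is only a sketch: the helper name c_and_h_files is hypothetical, and it assumes the directory name contains no whitespace or wildcard characters, which glob would otherwise split or expand.

```perl
use strict;
use warnings;

# Hypothetical helper: list the .c and .h files in a directory using
# csh-style brace alternation in a single glob pattern.
# Assumes $dir contains no whitespace or glob metacharacters.
sub c_and_h_files {
    my ($dir) = @_;
    my @paths  = glob("$dir/*.{c,h}");
    my @sorted = sort @paths;    # don't rely on the order of expansion
    return @sorted;
}

# Example use: print the matching files in the current directory.
print "$_\n" for c_and_h_files(".");
```

The returned names carry the directory prefix given in the pattern, so they can be passed straight to open or to a file test.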

You can also use readdir to extract the filenames manually:

opendir(DIR, $path)      or die "Can't open $path: $!";
@files = grep { /\.c$/ } readdir(DIR);
closedir(DIR);

The CPAN module File::KGlob does globbing without length limits:

use File::KGlob;

@files = glob("*.c");

Discussion

Perl's built-in glob and <WILDCARD> notation (not to be confused with <FILEHANDLE>) currently use an external program to get the list of filenames on most platforms. This program is csh on Unix,[2] and a program called dosglob.exe on Windows. On VMS and the Macintosh, file globs are done internally without an external program. Globs are supposed to give C shell semantics even on non-Unix systems to encourage portability. Because glob runs a shell on Unix, it is also inappropriate for setuid scripts.

[2] Usually. If tcsh is installed, Perl uses that because it's safer. If neither is installed, /bin/sh is used.

To get around this, you can either roll your own selection mechanism using the built-in opendir or use CPAN's File::KGlob; neither uses an external program. File::KGlob provides Unix shell-like globbing semantics, whereas opendir lets you select files with Perl's regular expressions.
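The two styles of selection do not always pick the same names. In shell-style patterns, * does not match a leading dot, whereas readdir returns every entry and a regular expression such as /\.c$/ will happily accept a hidden file. The sketch below makes the difference visible. It uses the built-in glob to stand in for shell-style semantics (not File::KGlob), the helper names are hypothetical, and it assumes the directory name contains no whitespace or wildcard characters.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: pick names ending in .c with a regular expression.
# readdir returns every entry, including names that start with a dot.
sub c_names_via_readdir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open $dir: $!";
    my @names = grep { /\.c$/ } readdir($dh);
    closedir($dh);
    my @sorted = sort @names;
    return @sorted;
}

# Hypothetical helper: pick names with the shell-style pattern *.c.
# The * wildcard does not match a leading dot, so hidden files are skipped.
# Assumes $dir contains no whitespace or glob metacharacters.
sub c_names_via_glob {
    my ($dir) = @_;
    my @names  = map { basename($_) } glob("$dir/*.c");
    my @sorted = sort @names;
    return @sorted;
}
```

If a directory holds a.c and .hidden.c, the readdir version returns both names while the glob version returns only a.c. Add a test such as !/^\./ to the grep if you want the regular expression version to skip hidden files too.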

At its simplest, an opendir solution uses grep to filter the list returned by readdir:

@files = grep { /\.[ch]$/i } readdir(DH);

You could also do this with the DirHandle module:

use DirHandle;

$dh = DirHandle->new($path)   or die "Can't open $path : $!\n";
@files = grep { /\.[ch]$/i } $dh->read();

As always, the filenames returned don't include the directory. When you use the filename, you'll need to prepend the directory name:

opendir(DH, $dir)        or die "Couldn't open $dir for reading: $!";

@files = ();
while( defined ($file = readdir(DH)) ) {
    next unless $file =~ /\.[ch]$/i;     # match against $file, not $_

    my $filename = "$dir/$file";         # prepend the directory name
    push(@files, $filename) if -T $filename;
}
closedir(DH);
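The same loop can be packaged as a self-contained subroutine. This is a sketch rather than part of the recipe: the name text_sources is hypothetical, a lexical directory handle replaces the bareword DH, and File::Spec->catfile builds the path instead of string interpolation. The important detail is that the -T test is applied to the full path, because a bare name from readdir would be looked up in the current directory.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the full paths of the .c and .h files
# in $dir that look like text files.
sub text_sources {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Couldn't open $dir for reading: $!";

    my @files;
    while (defined(my $file = readdir($dh))) {
        next unless $file =~ /\.[ch]$/i;                   # name filter
        my $filename = File::Spec->catfile($dir, $file);   # directory + name
        push @files, $filename if -T $filename;            # test the full path
    }
    closedir($dh);

    my @sorted = sort @files;
    return @sorted;
}
```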

The following example combines directory reading and filtering with the Schwartzian Transform from Chapter 4, Arrays, for efficiency. It sets @dirs to a numerically sorted list of those subdirectories of $path whose names are entirely numeric:

@dirs = map  { $_->[1] }                # extract pathnames
        sort { $a->[0] <=> $b->[0] }    # sort names numeric
        grep { -d $_->[1] }             # path is a dir
        map  { [ $_, "$path/$_" ] }     # form (name, path)
        grep { /^\d+$/ }                # just numerics
        readdir(DIR);                   # all files

Recipe 4.15 explains how to read these strange-looking constructs. As always, formatting and documenting your code can make it much easier to read and understand.
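For experimenting with the pipeline, it may help to wrap it in a subroutine that opens and closes the directory itself. The following is a sketch under that assumption; the name numeric_subdirs is hypothetical, and a lexical handle is used in place of DIR.

```perl
use strict;
use warnings;

# Hypothetical helper: return the paths of the subdirectories of $path
# whose names consist only of digits, sorted numerically by name.
sub numeric_subdirs {
    my ($path) = @_;
    opendir(my $dh, $path) or die "Can't open $path: $!";
    my @dirs = map  { $_->[1] }               # extract pathnames
               sort { $a->[0] <=> $b->[0] }   # sort names numerically
               grep { -d $_->[1] }            # keep only directories
               map  { [ $_, "$path/$_" ] }    # form [name, path] pairs
               grep { /^\d+$/ }               # just numeric names
               readdir($dh);                  # all directory entries
    closedir($dh);
    return @dirs;
}
```

Given subdirectories named 9, 10, and 100, the result lists them in that order, whereas a plain string sort would put 10 and 100 before 9.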

See Also

The opendir, readdir, closedir, grep, map, and sort functions in perlfunc (1) and in Chapter 3 of Programming Perl; documentation for the standard DirHandle module (also in Chapter 7 of Programming Perl); the "I/O Operators" section of perlop (1), and the "Filename Globbing Operator" section of Chapter 2 of Programming Perl; we talk more about globbing in Recipe 6.9; Recipe 9.7



Copyright © 2001 O'Reilly & Associates. All rights reserved.