Recipe 9.10. Splitting a Filename into Its Component Parts (Perl Cookbook)

9.10. Splitting a Filename into Its Component Parts

Problem

You want to extract a filename, its enclosing directory, or the extension(s) from a string that contains a full pathname.

Solution

Use routines from the standard File::Basename module.

use File::Basename;

$base = basename($path);
$dir  = dirname($path);
($base, $dir, $ext) = fileparse($path);    # $ext is empty unless you also pass a suffix pattern

Discussion

The standard File::Basename module contains routines to split up a filename. dirname and basename supply the directory and filename portions respectively:

$path = '/usr/lib/libc.a';
$file = basename($path);    
$dir  = dirname($path);     

print "dir is $dir, file is $file\n";
# dir is /usr/lib, file is libc.a

The fileparse function can be used to extract the extension. To do so, pass fileparse the path to decipher and a regular expression that matches the extension. You must give fileparse this pattern because an extension isn't necessarily dot-separated. Consider ".tar.gz"--is the extension ".tar", ".gz", or ".tar.gz"? By specifying the pattern, you control which of these you get.

$path = '/usr/lib/libc.a';
($name,$dir,$ext) = fileparse($path,'\..*');

print "dir is $dir, name is $name, extension is $ext\n";
# dir is /usr/lib/, name is libc, extension is .a
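To make the effect of the pattern concrete, here is a minimal sketch that parses one path with two different patterns. The path /tmp/archive.tar.gz and the variable names are invented for illustration, and the directory portion assumes Unix conventions.

```perl
use strict;
use warnings;
use File::Basename;

# Hypothetical path, used only for illustration.
my $path = '/tmp/archive.tar.gz';

# '\..*' matches from the first dot to the end of the filename.
my ($name_all, $dir_all, $ext_all) = fileparse($path, '\..*');

# '\.[^.]*' can match only the final dot and whatever follows it.
my ($name_last, $dir_last, $ext_last) = fileparse($path, '\.[^.]*');

print "first dot onward: name is $name_all, extension is $ext_all\n";
print "last dot onward:  name is $name_last, extension is $ext_last\n";
# first dot onward: name is archive, extension is .tar.gz
# last dot onward:  name is archive.tar, extension is .gz
```

Whatever the pattern leaves unmatched stays in the name, so the second call reports "archive.tar" as the name.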

By default, these routines parse pathnames using your operating system's normal conventions for directory separators by looking at the $^O variable, which holds a string identifying the system you're running on. That value was determined when Perl was built and installed. You can change the default by calling the fileparse_set_fstype routine. This alters the behavior of subsequent calls to the File::Basename functions:

fileparse_set_fstype("MacOS");
$path = "Hard%20Drive:System%20Folder:README.txt";
($name,$dir,$ext) = fileparse($path,'\..*');

print "dir is $dir, name is $name, extension is $ext\n";
# dir is Hard%20Drive:System%20Folder:, name is README, extension is .txt
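Because fileparse_set_fstype changes state for the whole module, code that switches conventions temporarily may want to switch back afterward. The sketch below assumes that fileparse_set_fstype returns the previously active type, as the File::Basename documentation describes. The Windows-style path is invented for illustration.

```perl
use strict;
use warnings;
use File::Basename;

# Switch to Windows conventions and remember what was in effect before.
my $previous = fileparse_set_fstype("MSWin32");

# Hypothetical path; single quotes keep the backslashes literal.
my $path = 'C:\Perl\lib\strict.pm';
my ($name, $dir, $ext) = fileparse($path, '\..*');

print "dir is $dir, name is $name, extension is $ext\n";
# dir is C:\Perl\lib\, name is strict, extension is .pm

# Restore the earlier convention so later calls are unaffected.
fileparse_set_fstype($previous);
```

Restoring the saved type keeps a one-off parse from silently changing how the rest of the program splits paths.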

To pull out just the extension, you might use this:

sub extension {
    my $path = shift;
    my $ext = (fileparse($path,'\..*'))[2];
    $ext =~ s/^\.//;
    return $ext;
}

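When called on a file like source.c.bak, this returns an extension of "c.bak", not just "bak". If you want just "bak" returned, use '\.[^.]*' as the second argument to fileparse. A non-greedy pattern such as '\..*?' does not help, because fileparse anchors the pattern at the end of the name and the match still begins at the first dot.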

When passed a pathname with a trailing directory separator, such as lib/, fileparse considers the directory name to be "lib/", whereas dirname considers it to be ".".
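A short sketch of that difference, assuming Unix path conventions. The variable names are illustrative only.

```perl
use strict;
use warnings;
use File::Basename;

my $path = 'lib/';

# fileparse keeps the trailing component as the directory part.
my ($name, $dir, $ext) = fileparse($path);

# dirname treats "lib" as an entry in the current directory.
my $dn = dirname($path);

print "fileparse: dir is '$dir', name is '$name'\n";
print "dirname:   dir is '$dn'\n";
# fileparse: dir is 'lib/', name is ''
# dirname:   dir is '.'
```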

See Also

The documentation for the standard File::Basename module (also in Chapter 7 of Programming Perl); the entry for $^O in perlvar(1) and in the "Special Variables" section of Chapter 2 of Programming Perl.



Copyright © 2001 O'Reilly & Associates. All rights reserved.