19.2 Embedding Overview (Advanced Perl Programming)

19.2 Embedding Overview

Strange as it may seem, there are no tools to automate the task of embedding Perl as there are for extending Perl. Why is that? After all, extensions also have to account for translating data from Perl to C and back (input and output parameters). The reason is that when Perl drives C code, it specifies precisely how and when a C extension is loaded. As an extension writer, your job is simply to write XSUBs in a callback style and provide some initialization code; the XSUBs are called when the script invokes the corresponding functions. In contrast, since there is no standard way to write a C application, you have to decide when to initialize an embedded Perl interpreter and how to hand control over to a Perl script.

To simplify embedding, this chapter shows you an easy-to-use veneer over Perl's internal API. These routines have been developed for this book to save you the bother of assimilating over 50 pages of internal documentation. But if you are the type who thrives on such details, Chapter 20, Perl Internals, should provide the needed fix. It also explains the code for these convenience routines.

It so happens that the Perl executable is made up of two parts: a library of core Perl routines[2] (libperl.a on Unix systems and perl.lib on Microsoft Windows systems, or dynamically loadable equivalents of the same) and a simple driver file, perlmain.c, containing main(), which, shorn of all its portability aspects, looks like this:

 #include <EXTERN.h>
 #include <perl.h>
 static PerlInterpreter *my_perl;
 int main(int argc, char **argv, char **env)
 {
     my_perl = perl_alloc();                        /* Initialize: allocate ...    */
     perl_construct(my_perl);                       /* ... and construct           */

     perl_parse(my_perl, xs_init, argc, argv, env); /* Run: parse the script ...   */
     perl_run(my_perl);                             /* ... and execute it          */

     perl_destruct(my_perl);                        /* Shut down: tear down ...    */
     perl_free(my_perl);                            /* ... and free                */
     return 0;
 }

[2] Not to be confused with the lib directory in a Perl distribution.

perl_alloc and perl_construct create an interpreter object. perl_parse does some more initializations, parses the command-line parameters provided to it via argc and argv, calls an initialization routine, xs_init, to load other extensions (or to at least initialize the dynamic loader), and finally parses the script provided as part of the command line. perl_run executes the script. Finally, perl_destruct and perl_free shut down and deallocate the interpreter.

To take advantage of the power of Perl, all you need to do is link the Perl library to your application and essentially clone the code in perlmain.c. We will talk about xs_init in the section "Adding Extensions" later in this chapter; until then, we will assume that we don't need any extensions and pass NULL to perl_parse instead of xs_init. The interpreter is fully primed once perl_parse is done, after which you can call all functions exported by the Perl library. In this chapter, however, we will restrict ourselves to a few high-level calls, listed in Table 19.1.

Table 19.1: Perl API Calls for Easy Embedding
Function Name	Description
perl_call_argv(char *sub, I32 flags, char **argv);	This call is available in the standard Perl distribution. It calls a subroutine with an array of string arguments terminated by NULL. Unfortunately, it doesn't return results in a convenient way. For this reason, the only flag we will use in this chapter is G_DISCARD, to tell Perl to silently discard all returned results.
perl_call_va(char *sub, [char *type, arg,]* ["OUT",] [char *type, arg,]* NULL);	This provides a convenient interface for passing a null-terminated list of typed parameters to a Perl subroutine and to collect the returned results into a list of parameters (similar to `printf` and `scanf`). The `type` argument can be `i`, `s`, or `d` (integer, string, double). The string `OUT` begins a list of return parameters, which are pairs of type specifiers and appropriately typed pointers. String output parameters are copied into the buffers supplied, which consequently should have enough space to absorb the returned strings. The parameter list must always be NULL-terminated. The function returns -1 on failure and the number of parameters returned by the procedure, if successful.
int perl_eval_va(char *str, [char *type, *arg], NULL);	Evaluates an arbitrary string, not just a subroutine. The string can be followed by any number of out parameters in the style discussed above. It does not need input parameters because they are already encoded in the string. `perl_eval_va` returns -1 on failure, or the number of result parameters returned by the evaluation.
set_int(char *var, int value); int get_int(char *var, int *pvalue);	Gets or sets a globally accessible, integer-valued scalar. `var` can contain ordinary scalar variable names or array and hash indices as follows: `foo`, `foo[10]`, or `foo{hello}`. `get_int` takes a pointer to an integer and returns 1 if successful (or 0 on failure). `set_int` creates a variable if it doesn't already exist.
set_double(char *var, double value); int get_double(char *var, double *pvalue);	Similar to above.
set_str(char *var, char *value); int get_str(char *var, char **value);	`get_str` returns the address of the string. You are expected to copy it into your own buffer.
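
The calling convention described for perl_call_va (type/value pairs, an optional "OUT" marker, type/pointer pairs, and a terminating NULL) can be easier to picture with a small standalone sketch. The function below, describe_va, is a hypothetical helper written for illustration only; it is not part of Perl or of this chapter's convenience routines, and it makes no claim about how perl_call_va is implemented. It merely walks a list of that shape with <stdarg.h> and records what it finds.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical illustration only: walks an argument list shaped like the
 * one perl_call_va accepts. Writes a summary such as
 * "in:i=10 in:s=abc out:d" into buf and returns the number of OUT
 * parameters, or -1 if the list is malformed or buf is too small.
 * Long string arguments are truncated in the summary. */
int describe_va(char *buf, size_t size, ...)
{
    va_list ap;
    char *type;
    int seen_out = 0;   /* set once the "OUT" marker has been consumed */
    int n_out = 0;
    int ok = 1;
    size_t used = 0;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';

    va_start(ap, size);
    while (ok && (type = va_arg(ap, char *)) != NULL) {
        char item[64];
        int n;

        if (!seen_out && strcmp(type, "OUT") == 0) {
            seen_out = 1;
            continue;
        }
        /* Only the one-letter specifiers i, s, and d are recognized. */
        if (strlen(type) != 1 || strchr("isd", type[0]) == NULL) {
            ok = 0;     /* unknown type: the next argument cannot be read safely */
            break;
        }
        if (seen_out) {
            /* Output slots are pointers; consume one of the matching type. */
            switch (type[0]) {
            case 'i': (void) va_arg(ap, int *);    break;
            case 's': (void) va_arg(ap, char *);   break;
            case 'd': (void) va_arg(ap, double *); break;
            }
            snprintf(item, sizeof item, "out:%c", type[0]);
            n_out++;
        } else {
            /* Input values arrive after the default argument promotions. */
            switch (type[0]) {
            case 'i': snprintf(item, sizeof item, "in:i=%d", va_arg(ap, int));    break;
            case 's': snprintf(item, sizeof item, "in:s=%s", va_arg(ap, char *)); break;
            case 'd': snprintf(item, sizeof item, "in:d=%g", va_arg(ap, double)); break;
            }
        }
        n = snprintf(buf + used, size - used, "%s%s", used ? " " : "", item);
        if (n < 0 || (size_t) n >= size - used)
            ok = 0;     /* summary did not fit */
        else
            used += (size_t) n;
    }
    va_end(ap);
    return ok ? n_out : -1;
}
```

A call such as describe_va(buf, sizeof buf, "i", 10, "s", "abc", "OUT", "d", &d, (char *)NULL) returns 1 and leaves "in:i=10 in:s=abc out:d" in buf. The cast on the final NULL is a portability precaution for variadic calls in general: a bare NULL may be passed as an int on some platforms.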

The get_* and set_* functions can manipulate only one scalar at a time. I accepted this limitation because Perl already provides a good set of functions that can slice, dice, and iterate through arrays and hashes; we'll take a detailed look at them in Chapter 20. Those functions, while faster and more fine-grained, are tied to internals-related details (memory management, temporary variables, and so on), so any discussion of them necessitates discussing these other aspects too. The get_* and set_* functions are simpler.
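
To make the three accepted name forms (foo, foo[10], and foo{hello}) concrete, here is a minimal standalone sketch of how such a name could be split into a base name and a subscript. The names parse_var, struct var_ref, and enum var_kind are hypothetical and exist only for this illustration; the actual convenience routines may well parse names differently.

```c
#include <assert.h>
#include <string.h>

/* The three forms listed in Table 19.1, plus an error value. */
enum var_kind { VAR_SCALAR, VAR_ARRAY_ELEM, VAR_HASH_ELEM, VAR_INVALID };

struct var_ref {
    enum var_kind kind;
    char name[64];    /* base name, e.g. "foo"                 */
    char index[64];   /* subscript text, e.g. "10" or "hello"  */
};

/* Hypothetical illustration only: splits "foo", "foo[10]", or
 * "foo{hello}" into its parts. The subscript text is not validated
 * (an array index is not checked for being numeric, for instance). */
enum var_kind parse_var(const char *var, struct var_ref *out)
{
    size_t len, open, sublen;
    char close;

    assert(var != NULL && out != NULL);
    len = strlen(var);
    open = strcspn(var, "[{");   /* offset of first bracket, or len if none */

    out->kind = VAR_INVALID;
    out->name[0] = '\0';
    out->index[0] = '\0';

    /* The base name must be non-empty and fit in the buffer. */
    if (open == 0 || open >= sizeof out->name)
        return VAR_INVALID;
    memcpy(out->name, var, open);
    out->name[open] = '\0';

    if (open == len) {           /* no subscript: an ordinary scalar */
        out->kind = VAR_SCALAR;
        return out->kind;
    }

    /* A subscript needs an opener, at least one character, and the
     * matching closer as the very last character of the string. */
    close = (var[open] == '[') ? ']' : '}';
    if (len - open < 3 || var[len - 1] != close)
        return VAR_INVALID;
    sublen = len - open - 2;
    if (sublen >= sizeof out->index)
        return VAR_INVALID;
    memcpy(out->index, var + open + 1, sublen);
    out->index[sublen] = '\0';

    out->kind = (close == ']') ? VAR_ARRAY_ELEM : VAR_HASH_ELEM;
    return out->kind;
}
```

With this sketch, parse_var("foo{hello}", &r) reports VAR_HASH_ELEM with r.name set to "foo" and r.index set to "hello", while a malformed name such as "foo[" is reported as VAR_INVALID.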

