=head1 NAME mod_perl internals: Apache 2.0 Integration =head1 Description This document should help to understand the initialization, request processing and shutdown process of the mod_perl module. This knowledge is essential for a less-painful debugging experience. It should also help to know where a new code should be added when a new feature is added. Internals of mod_perl-specific features are discussed in L. Make sure to read also: L. =head1 Startup Apache starts itself and immediately restart itself. The following sections discuss what happens to mod_perl during this period. =head2 The Link Between mod_perl and httpd I includes a special data structure: module AP_MODULE_DECLARE_DATA perl_module = { STANDARD20_MODULE_STUFF, modperl_config_dir_create, /* dir config creater */ modperl_config_dir_merge, /* dir merger --- default is to override */ modperl_config_srv_create, /* server config */ modperl_config_srv_merge, /* merge server config */ modperl_cmds, /* table of config file commands */ modperl_register_hooks, /* register hooks */ }; Apache uses this structure to hook mod_perl in, and it specifies six custom callbacks which Apache will call at various stages that will be explained later. C is a standard macro defined in I. Currently its main use is for attaching Apache version magic numbers, so the previously compiled module won't be attempted to be used with newer Apache versions, whose API may have changed. C is a struct, that defines the mod_perl configuration directives and the callbacks to be invoked for each of these. =head1 Configuration Tree Building At the C stage the configuration file is parsed and stored in a parsed configuration tree is created. Some sections are stored unmodified in the parsed configuration tree to be processed after the C hooks were run. Other sections are processed right away (e.g., the C directive includes extra configuration and has to include it as soon as it was seen) and they may or may not add a subtree to the configuration tree. C feeds the configuration file lines from to C, which tokenizes the input, and uses the first token as a potential directive (command). It then calls C to find a module that has registered that command (remember mod_perl has registered the directives in the C C array, which was passed to C inside the C struct?). If that command is found and it has the C flag set in its I field, the callback for that command is invoked. Depending on the command, it may perform some action and return (e.g., C), or it may continue reading from the configuration file and recursively execute other nested commands till it's done (e.g., CLocation ...E>). If the command is found but the C flag is not set or the command is not found, the current node gets added to the configuration tree and will be processed during the C stage, after the C stage will be over. If the command needs to be executed at this stage as it was just explained, C invokes the corresponding callback with C. Since C directive has the C flag set, that directive is executed as soon as it's seen and the modules its supposed to load get loaded right away. For mod_perl loaded as a DSO object, this is when mod_perl starts its game. =head2 Enabling the mod_perl Module and Installing its Callbacks mod_perl can be loaded as a DSO object at startup time, or be prelinked at compile time. For statically linked mod_perl, Apache enables mod_perl by calling C, which happens during the C stage. The latter is happening before the configuration file is parsed. When mod_perl is loaded as DSO: LoadModule perl_module "modules/mod_perl.so" mod_dso's C first loads the shared mod_perl object, and then immediately calls C which calls C to enable mod_perl. C adds the C structure to the top of chained module list and calls C which calls the C callback. This is the very first mod_perl hook that's called by Apache. C registers all the hooks that it wants to be called by Apache when the appropriate time comes. That includes configuration hooks, filter, connection and http protocol hooks. From now on most of the relationship between httpd and mod_perl is done via these hooks. Remember that in addition to these hooks, there are four hooks that were registered with C, and there are: C, C, C and C. Finally after the hooks were registered, C (called from mod_dso's C in case of DSO) runs the configuration process for the module. First it calls the C callback for the main server, followed by the C callback to create a directory structure for the main server. Notice that it passes C for the directory path, since we at the very top level. If you need to do something as early as possible at mod_perl's startup, the C is the right place to do that. For example we add a C define to the C here: *(char **)apr_array_push(ap_server_config_defines) = apr_pstrdup(p, "MODPERL2"); so the following code will work under mod_perl 2.0 enabled Apache without explicitly passing C<-DMODPERL2> at the server startup: # 2.0 configuration PerlSwitches -wT This section, of course, will see the define only if inserted after the C, because that's when C is called. One inconvenience with using that hook, is that the server object is not among its arguments, so if you need to access that object, the next earliest function is C. However remember that it'll be called once for the main server and one more time for each virtual host, that has something to do with mod_perl. So if you need to invoke it only for the main server, you can use a Cis_virtual> conditional. For example we need to enable the debug tracing as early as possible, but we need the server object in order to do that, so we perform this setting in C: if (!s->is_virtual) { modperl_trace_level_set(s, NULL); } =head1 The C Phase After Apache processes its command line arguments, creates various pools and reads the configuration file in, it runs the registered I hooks by calling C. That's when C is called. And it does nothing. =head2 Configuration Tree Processing C calls C, which scans through all directives in the parsed configuration tree, and executes each one by calling C. This is a recursive process with many twists. Similar to C for each command (directive) in the configuration tree, it calls C to find a module that registered that command. If the command is not found the server dies. Otherwise the callback for that command is invoked with C, after fetching the current directory configuration: invoke_cmd(cmd, parms, dir_config, current->args); The C command is the one that invokes mod_perl's directives callbacks, which reside in I. C knows how the arguments should be passed to the callbacks, based on the information in the C array that we have just mentioned. Notice that before C is invoked, C is called which sets the current server and section configuration objects for the module in which the directive has been found. If these objects were't created yet, it calls the registered callbacks as C and C, which are C and C for the mod_perl module. (If you write your L, these correspond to the C and C Perl subroutines.) The command callback won't be invoked if it has the C flag set, because it was already invoked earlier when the configuration tree was parsed. C is called in any case, because it wasn't called during the C. So we have C and C both called once for the main server (at the end of processing the C directive), and one more time for each virtual host in which at least one mod_perl directive is encountered. In addition C is called for every section and subsection that includes mod_perl directives (META: or inherits from such a section even though specifies no mod_perl directives in it?). =head2 Virtual Hosts Fixup After the configuration tree is processed, C is called. One of the responsibilities of this function is to merge the virtual hosts configuration objects with the base server's object. If there are virtual hosts, C calls C and C for each virtual host, to perform this merge for mod_perl configuration objects. META: is that's the place where everything restarts? it doesn't restart under debugger since we run with NODETACH I believe. =head2 The I Phase After Apache processes the configuration it's time for the I phase, executed by C. mod_perl has registered the C hook to be called for this phase. META: complete what happens at this stage in mod_perl META: why is it called modperl_hook_init and not open_logs? is it because it can be called from other functions? =head2 The I Phase Immediately after I, the I phase follows. Here C calls C =head1 Request Processing META: need to write =head1 Shutdown META: need to write =head1 Maintainers Maintainer is the person(s) you should contact with updates, corrections and patches. =over =item * Stas Bekman [http://stason.org/] =back =head1 Authors =over =item * =back Only the major authors are listed above. For contributors see the Changes file. =cut