附录 F. 扩展 PHP 3

该章节中的内容可能已经比较过时了,它讲解了如何扩展 PHP3。如果您对 PHP 4 感兴趣,请阅读“Zend API”的有关章节。同时,您还可以阅读 PHP 源文档中的一些文档,例如 README.SELF-CONTAINED-EXTENSIONSREADME.EXT_SKEL

为 PHP 新增函数

函数原型

所有的函数有着如下形式:
void php3_foo(INTERNAL_FUNCTION_PARAMETERS) {
     
}
即使您的函数没有任何参数,这也是它被调用的方式。

函数参数

参数总是属于 pval 类型。该类型包含了各个参数的实际类型。因此,如果您的函数有两个参数,您应该在您的函数顶部做如下工作:

例子 F-1. 取得函数参数

pval *arg1, *arg2;
if (ARG_COUNT(ht) != 2 || getParameters(ht,2,&arg1,&arg2)==FAILURE) {
   WRONG_PARAM_COUNT;
}
注意:参数既可以被值(value)传递又可以被引用(reference)传递。两种情况您都需要将 &(pval *) 传递给 getParameters 函数。如果您想检查第 n 个参数是否通过引用传递给您,您可以使用 ParameterPassedByReference(ht,n) 函数。它的返回值为 0 或者 1。

当您改变了任何已传递的参数,无论它们是以值或者引用的方式传递,您都可以对它使用 pval_destructor 函数来销毁它。或者如果它是一个您想添加的数组,您可以使用类似于 internal_functions.h 中定义函数的方法来将返回值 return_value 作为数组来操作。

此外当你将一个参数改为 IS_STRING 时要确保先给 estrdup() 出来的字符串赋值以及给出字符串长度,然后才将它的类型改为 IS_STRING。如果要修改一个已经是 IS_STRING 或 IS_ARRAY 的参数应该先对其运行 pval_destructor。

可变函数参数

函数可以接受不同数目的参数。如果你的函数可以接受 2 或 3 个参数,这样用:

例子 F-2. 可变函数参数

pval *arg1, *arg2, *arg3;
int arg_count = ARG_COUNT(ht);

if (arg_count < 2 || arg_count > 3 ||
    getParameters(ht,arg_count,&arg1,&arg2,&arg3)==FAILURE) {
    WRONG_PARAM_COUNT;
}

使用函数参数

每个参数的类型存放在 pval 的类型字段中。可以是以下任何一种:

表格 F-1. PHP 内部类型

IS_STRING字符串
IS_DOUBLE双精度浮点型
IS_LONG长整型
IS_ARRAY数组
IS_EMPTYNone
IS_USER_FUNCTION??
IS_INTERNAL_FUNCTION??(如果其中某些不能被传递给一个函数,则删除)
IS_CLASS??
IS_OBJECT??

如果得到了一种类型的参数但是却想按照另一种类型来使用,或者想强制参数为某种类型,可以使用以下的转换函数:
convert_to_long(arg1);
convert_to_double(arg1);
convert_to_string(arg1); 
convert_to_boolean_long(arg1); /* If the string is "" or "0" it becomes 0, 1 otherwise */
convert_string_to_number(arg1);  /* Converts string to either LONG or DOUBLE depending on string */

这些函数都是即时转换的,并不返回任何值。

实际上的参数被存放在一个联合中,成员为:

  • IS_STRING: arg1->value.str.val

  • IS_LONG: arg1->value.lval

  • IS_DOUBLE: arg1->value.dval

函数中的内存管理

函数所使用的任何内存都应该通过 emalloc() 或者 estrdup() 来申请。它们是内存处理抽象函数,看上去就像普通的 malloc() 和 strdup() 函数一样。内存应该用 efree() 来释放。

程序中有两种内存:作为变量被返回到解释器中的内存,以及内部函数所需要的临时存储空间。当把一个字符串赋给一个返回解释器的变量时需要确保首先通过 emalloc() 或者 estrdup() 分配了内存。该内存决不该由你来释放,除非你在同一个函数中后来又覆盖了原来的赋值(这并不是一种好的编程实践)。

任何在你的函数/库中所需要的临时/永久内存都要通过这三个函数来进行:emalloc(),estrdup() 和 efree()。它们的行为完全和它们相对应的函数相同。任何通过 emalloc() 或 estrdup() 都要用 efree() 在某个环节释放,除非它们本来就需要存在到程序结束。否则就会出现内存泄漏。“它们的行为完全和它们相对应的函数相同”的意思是:如果你用 efree() 释放了不是由 emalloc() 或 estrdup() 分配的内存时可能会导致段错误。所以要留意释放所有不需要的内存。

如果编译时指定了 "-DDEBUG",PHP 将在脚本运行结束后显示出所有通过 emalloc() 和 estrdup() 分配但是没有用 efree() 释放的内存列表。

在符号表中设定变量

有一些宏可以使在符号表中设定变量更容易:

  • SET_VAR_STRING(name,value)

  • SET_VAR_DOUBLE(name,value)

  • SET_VAR_LONG(name,value)

警告

要注意 SET_VAR_STRING。值的部分必须是人工分配出来的,因为内存管理代码会在以后尝试释放该指针。不要将静态分配的内存传递给 SET_VAR_STRING。

PHP 中的符号表是以散列表来实现的。任何时候 &symbol_table 都是一个指向 'main' 符号表的指针,active_symbol_table points 指向当前活动的符号表(二者在开始时等同,而在函数内不同)。

以下例子使用了 'active_symbol_table'。如果你指明要工作于 'main' 符号表,应该用 &symbol_table 来替代。此外,同样的函数也能像下面说明的作用于数组。

例子 F-3. 检查 $foo 是否存在于符号表

if (hash_exists(active_symbol_table,"foo",sizeof("foo"))) { exists... }
else { doesn't exist }

例子 F-4. 在符号表中找到一个变量的字长

hash_find(active_symbol_table,"foo",sizeof("foo"),&pvalue);
check(pvalue.type);
PHP 中的数组使用了和符号表同一个散列表来实现。这意味着以上两个函数也可以用来检查数组中的变量。

如果要在符号表中定义一个新数组,应该这样做。

首先,要用 hash_exists() 或 hash_find() 检查它是否存在并且正确中止。

接着,初始化数组:

例子 F-5. 初始化新数组

pval arr;
  
if (array_init(&arr) == FAILURE) { failed... };
hash_update(active_symbol_table,"foo",sizeof("foo"),&arr,sizeof(pval),NULL);
这段代码在活动符号表中定义了一个名为 $foo 的新数组。这是个空数组。

这里展示了怎样向其中加入新的条目:

例子 F-6. 向数组中加入新的条目

pval entry;
  
entry.type = IS_LONG;
entry.value.lval = 5;
  
/* defines $foo["bar"] = 5 */
hash_update(arr.value.ht,"bar",sizeof("bar"),&entry,sizeof(pval),NULL); 

/* defines $foo[7] = 5 */
hash_index_update(arr.value.ht,7,&entry,sizeof(pval),NULL); 

/* defines the next free place in $foo[],
 * $foo[8], to be 5 (works like php2)
 */
hash_next_index_insert(arr.value.ht,&entry,sizeof(pval),NULL);
如果要修改一个插入到散列表中的值,必须先将其提取出来。要避免额外开销你可以给 hash add 函数提供一个 pval ** 参数,它将被表中相应单元的 pval * 地址所更新。如果其值为 NULL(如同以上所有的例子),该参数被忽略。

hash_next_index_insert() 或多或少使用了如同 PHP 2.0 中 "$foo[] = bar;" 的逻辑。

如果要建立一个从函数返回的数组,可以就像以上那样初始化数组:

if (array_init(return_value) == FAILURE) { failed...; }

接着用有用的函数加入值:

add_next_index_long(return_value,long_value);
add_next_index_double(return_value,double_value);
add_next_index_string(return_value,estrdup(string_value));

当然了,如果如果在数组初始化后添加不正确,你应该先查看一下数组:
pval *arr;
  
if (hash_find(active_symbol_table,"foo",sizeof("foo"),(void **)&arr)==FAILURE) { can't find... }
else { use arr->value.ht... }

注意 hash_find 接受一个指向 pval 指针的指针,并不是 pval 指针。

以上任何 hash 函数返回 SUCCESS 或 FAILURE(除了 hash_exists(),它返回一个布尔真值)。

Returning simple values

A number of macros are available to make returning values from a function easier.

The RETURN_* macros all set the return value and return from the function:

  • RETURN

  • RETURN_FALSE

  • RETURN_TRUE

  • RETURN_LONG(l)

  • RETURN_STRING(s,dup) If dup is TRUE, duplicates the string

  • RETURN_STRINGL(s,l,dup) Return string (s) specifying length (l).

  • RETURN_DOUBLE(d)

The RETVAL_* macros set the return value, but do not return.

  • RETVAL_FALSE

  • RETVAL_TRUE

  • RETVAL_LONG(l)

  • RETVAL_STRING(s,dup) If dup is TRUE, duplicates the string

  • RETVAL_STRINGL(s,l,dup) Return string (s) specifying length (l).

  • RETVAL_DOUBLE(d)

The string macros above will all estrdup() the passed 's' argument, so you can safely free the argument after calling the macro, or alternatively use statically allocated memory.

If your function returns boolean success/error responses, always use RETURN_TRUE and RETURN_FALSE respectively.

Returning complex values

Your function can also return a complex data type such as an object or an array.

Returning an object:

  1. Call object_init(return_value).

  2. Fill it up with values. The functions available for this purpose are listed below.

  3. Possibly, register functions for this object. In order to obtain values from the object, the function would have to fetch "this" from the active_symbol_table. Its type should be IS_OBJECT, and it's basically a regular hash table (i.e., you can use regular hash functions on .value.ht). The actual registration of the function can be done using:
    add_method( return_value, function_name, function_ptr );

The functions used to populate an object are:

  • add_property_long( return_value, property_name, l ) - Add a property named 'property_name', of type long, equal to 'l'

  • add_property_double( return_value, property_name, d ) - Same, only adds a double

  • add_property_string( return_value, property_name, str ) - Same, only adds a string

  • add_property_stringl( return_value, property_name, str, l ) - Same, only adds a string of length 'l'

Returning an array:

  1. Call array_init(return_value).

  2. Fill it up with values. The functions available for this purpose are listed below.

The functions used to populate an array are:

  • add_assoc_long(return_value,key,l) - add associative entry with key 'key' and long value 'l'

  • add_assoc_double(return_value,key,d)

  • add_assoc_string(return_value,key,str,duplicate)

  • add_assoc_stringl(return_value,key,str,length,duplicate) specify the string length

  • add_index_long(return_value,index,l) - add entry in index 'index' with long value 'l'

  • add_index_double(return_value,index,d)

  • add_index_string(return_value,index,str)

  • add_index_stringl(return_value,index,str,length) - specify the string length

  • add_next_index_long(return_value,l) - add an array entry in the next free offset with long value 'l'

  • add_next_index_double(return_value,d)

  • add_next_index_string(return_value,str)

  • add_next_index_stringl(return_value,str,length) - specify the string length

Using the resource list

PHP has a standard way of dealing with various types of resources. This replaces all of the local linked lists in PHP 2.0.

Available functions:

  • php3_list_insert(ptr, type) - returns the 'id' of the newly inserted resource

  • php3_list_delete(id) - delete the resource with the specified id

  • php3_list_find(id,*type) - returns the pointer of the resource with the specified id, updates 'type' to the resource's type

Typically, these functions are used for SQL drivers but they can be used for anything else; for instance, maintaining file descriptors.

Typical list code would look like this:

例子 F-7. Adding a new resource

RESOURCE *resource;

/* ...allocate memory for resource and acquire resource... */
/* add a new resource to the list */
return_value->value.lval = php3_list_insert((void *) resource, LE_RESOURCE_TYPE);
return_value->type = IS_LONG;

例子 F-8. Using an existing resource

pval *resource_id;
RESOURCE *resource;
int type;

convert_to_long(resource_id);
resource = php3_list_find(resource_id->value.lval, &type);
if (type != LE_RESOURCE_TYPE) {
	php3_error(E_WARNING,"resource index %d has the wrong type",resource_id->value.lval);
	RETURN_FALSE;
}
/* ...use resource... */

例子 F-9. Deleting an existing resource

pval *resource_id;
RESOURCE *resource;
int type;

convert_to_long(resource_id);
php3_list_delete(resource_id->value.lval);
The resource types should be registered in php3_list.h, in enum list_entry_type. In addition, one should add shutdown code for any new resource type defined, in list.c's list_entry_destructor() (even if you don't have anything to do on shutdown, you must add an empty case).

Using the persistent resource table

PHP has a standard way of storing persistent resources (i.e., resources that are kept in between hits). The first module to use this feature was the MySQL module, and mSQL followed it, so one can get the general impression of how a persistent resource should be used by reading mysql.c. The functions you should look at are:

php3_mysql_do_connect
php3_mysql_connect()
php3_mysql_pconnect()

The general idea of persistence modules is this:

  1. Code all of your module to work with the regular resource list mentioned in section (9).

  2. Code extra connect functions that check if the resource already exists in the persistent resource list. If it does, register it as in the regular resource list as a pointer to the persistent resource list (because of 1., the rest of the code should work immediately). If it doesn't, then create it, add it to the persistent resource list AND add a pointer to it from the regular resource list, so all of the code would work since it's in the regular resource list, but on the next connect, the resource would be found in the persistent resource list and be used without having to recreate it. You should register these resources with a different type (e.g. LE_MYSQL_LINK for non-persistent link and LE_MYSQL_PLINK for a persistent link).

If you read mysql.c, you'll notice that except for the more complex connect function, nothing in the rest of the module has to be changed.

The very same interface exists for the regular resource list and the persistent resource list, only 'list' is replaced with 'plist':

  • php3_plist_insert(ptr, type) - returns the 'id' of the newly inserted resource

  • php3_plist_delete(id) - delete the resource with the specified id

  • php3_plist_find(id,*type) - returns the pointer of the resource with the specified id, updates 'type' to the resource's type

However, it's more than likely that these functions would prove to be useless for you when trying to implement a persistent module. Typically, one would want to use the fact that the persistent resource list is really a hash table. For instance, in the MySQL/mSQL modules, when there's a pconnect() call (persistent connect), the function builds a string out of the host/user/passwd that were passed to the function, and hashes the SQL link with this string as a key. The next time someone calls a pconnect() with the same host/user/passwd, the same key would be generated, and the function would find the SQL link in the persistent list.

Until further documented, you should look at mysql.c or msql.c to see how one should use the plist's hash table abilities.

One important thing to note: resources going into the persistent resource list must *NOT* be allocated with PHP's memory manager, i.e., they should NOT be created with emalloc(), estrdup(), etc. Rather, one should use the regular malloc(), strdup(), etc. The reason for this is simple - at the end of the request (end of the hit), every memory chunk that was allocated using PHP's memory manager is deleted. Since the persistent list isn't supposed to be erased at the end of a request, one mustn't use PHP's memory manager for allocating resources that go to it.

When you register a resource that's going to be in the persistent list, you should add destructors to it both in the non-persistent list and in the persistent list. The destructor in the non-persistent list destructor shouldn't do anything. The one in the persistent list destructor should properly free any resources obtained by that type (e.g. memory, SQL links, etc). Just like with the non-persistent resources, you *MUST* add destructors for every resource, even it requires no destruction and the destructor would be empty. Remember, since emalloc() and friends aren't to be used in conjunction with the persistent list, you mustn't use efree() here either.

Adding runtime configuration directives

Many of the features of PHP can be configured at runtime. These configuration directives can appear in either the designated php3.ini file, or in the case of the Apache module version in the Apache .conf files. The advantage of having them in the Apache .conf files is that they can be configured on a per-directory basis. This means that one directory may have a certain safemodeexecdir for example, while another directory may have another. This configuration granularity is especially handy when a server supports multiple virtual hosts.

The steps required to add a new directive:

  1. Add directive to php3_ini_structure struct in mod_php3.h.

  2. In main.c, edit the php3_module_startup function and add the appropriate cfg_get_string() or cfg_get_long() call.

  3. Add the directive, restrictions and a comment to the php3_commands structure in mod_php3.c. Note the restrictions part. RSRC_CONF are directives that can only be present in the actual Apache .conf files. Any OR_OPTIONS directives can be present anywhere, include normal .htaccess files.

  4. In either php3take1handler() or php3flaghandler() add the appropriate entry for your directive.

  5. In the configuration section of the _php3_info() function in functions/info.c you need to add your new directive.

  6. And last, you of course have to use your new directive somewhere. It will be addressable as php3_ini.directive.