In part 1, we built a PHP extension that loads, shows up in phpinfo, and exposes a function that does nothing. Time to make it do something.
The plan is simple: PHP stores all its functions in a hash table. We have the function name, we have a replacement closure. We just… swap them. Right?
How PHP Stores Functions
Every PHP function (built-in, user-defined, from extensions) ends up in a hash table called EG(function_table). The key is the function name (lowercased), the value is a zend_function struct.
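The lowercased key is why every lookup in this article starts by lowercasing the name. Here is a rough stand-alone sketch of that idea in plain C. It does not use the Zend headers: `toy_entry`, `toy_lowercase` and `toy_find` are made-up names, and a linear scan stands in for the real hash lookup.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a lowercased key plus some payload. */
typedef struct {
    const char *key;
    int id;
} toy_entry;

/* Lowercase src into dst; returns 0 if dst is too small. */
static int toy_lowercase(const char *src, char *dst, size_t dst_size) {
    size_t len = strlen(src);
    if (len + 1 > dst_size) {
        return 0;
    }
    for (size_t i = 0; i < len; i++) {
        dst[i] = (char)tolower((unsigned char)src[i]);
    }
    dst[len] = '\0';
    return 1;
}

/* Normalise the requested name first, then compare against stored keys.
 * (Assumption: a linear scan replaces the hash table for brevity.) */
static const toy_entry *toy_find(const toy_entry *table, size_t count,
                                 const char *name) {
    char key[64];
    if (!toy_lowercase(name, key, sizeof key)) {
        return NULL;
    }
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].key, key) == 0) {
            return &table[i];
        }
    }
    return NULL;
}
```

With a table holding "strlen" and "greet", looking up "StrLen" or "GREET" finds the same entries, which matches how PHP treats function names as case-insensitive.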
zend_function is actually a union of two different structs:
union _zend_function {
zend_uchar type; /* ZEND_INTERNAL_FUNCTION or ZEND_USER_FUNCTION */
struct { ... } common; /* shared fields: name, scope, arg_info, flags */
zend_op_array op_array; /* user functions: PHP bytecode */
zend_internal_function internal_function; /* C functions: handler pointer */
};
Internal functions (like strlen, array_map, anything written in C) have a handler: a C function pointer. User functions (anything you write in PHP) have an op_array: compiled bytecode that the Zend VM executes.
This distinction is going to matter a lot.
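To make the distinction concrete, here is a toy model in plain C. It is only a sketch of the idea, not the real Zend definitions: every `toy_` name is invented, and the "bytecode" is just a list of constants to add. The point is that the `type` tag decides which union member is valid, so a caller has to check it before touching `handler` or the op array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for ZEND_INTERNAL_FUNCTION / ZEND_USER_FUNCTION. */
enum { TOY_INTERNAL_FUNCTION = 1, TOY_USER_FUNCTION = 2 };

typedef int (*toy_handler)(int arg);

/* Both members start with the same fields, then diverge. */
typedef struct {
    uint8_t type;
    const char *name;
    const int *ops;      /* toy "bytecode": constants to add */
    size_t op_count;
} toy_op_array;

typedef struct {
    uint8_t type;
    const char *name;
    toy_handler handler; /* C function pointer */
} toy_internal_function;

typedef union {
    uint8_t type;
    toy_op_array op_array;
    toy_internal_function internal_function;
} toy_function;

static int toy_double(int x) { return x * 2; }

/* Dispatch on the tag: internal functions jump into C,
 * user functions are "interpreted". */
static int toy_call(const toy_function *fn, int arg) {
    if (fn->type == TOY_INTERNAL_FUNCTION) {
        return fn->internal_function.handler(arg);
    }
    int acc = arg;
    for (size_t i = 0; i < fn->op_array.op_count; i++) {
        acc += fn->op_array.ops[i];
    }
    return acc;
}
```

A caller that skips the `type` check and goes straight to one member is trusting that the entry is the kind of function it expects.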
First Attempt: Just Swap It
The idea: look up the function in EG(function_table), build a wrapper zend_internal_function that calls our closure, and replace the entry with zend_hash_update_mem.
Let’s add a replace_function helper to our extension.
funswap.c, add this above the ZEND_FUNCTION(FunSwap_replace):
/* funswap.c — function replacement logic */
static int replace_function(zend_string *fn_name, zval *replacement)
{
zend_string *fn_lc = zend_string_tolower(fn_name);
/* Look up the original function */
zend_function *original = zend_hash_find_ptr(EG(function_table), fn_lc);
if (!original) {
zend_string_release(fn_lc);
return 0; /* function not found */
}
/* Build a wrapper internal function */
zend_internal_function wrapper;
memset(&wrapper, 0, sizeof(wrapper));
wrapper.type = ZEND_INTERNAL_FUNCTION;
wrapper.function_name = zend_string_copy(original->common.function_name);
wrapper.handler = NULL; /* ... what goes here? */
/* Replace in function table */
zend_hash_update_mem(EG(function_table), fn_lc, &wrapper,
sizeof(zend_internal_function));
zend_string_release(fn_lc);
return 1;
}
We’ll deal with the handler in a moment. First, let’s wire this up.
funswap.c, update the replace() implementation:
/* funswap.c — updated ZEND_FUNCTION */
ZEND_FUNCTION(FunSwap_replace) {
zend_string *function_name;
zval *replacement;
ZEND_PARSE_PARAMETERS_START(2, 2)
Z_PARAM_STR(function_name)
Z_PARAM_OBJECT_OF_CLASS(replacement, zend_ce_closure)
ZEND_PARSE_PARAMETERS_END();
RETURN_BOOL(replace_function(function_name, replacement));
}
The Handler Problem
Our wrapper needs a C function handler: the function PHP calls when someone invokes the replaced function. For now, let’s write the simplest possible one: call the replacement closure and return its result.
funswap.c, add this above replace_function:
/* funswap.c — the wrapper handler */
static zval *stored_replacement = NULL;
static void funswap_wrapper_handler(INTERNAL_FUNCTION_PARAMETERS) {
if (!stored_replacement) {
RETURN_NULL();
}
/* Forward all arguments to the replacement closure */
uint32_t argc = ZEND_CALL_NUM_ARGS(execute_data);
zval *args = (argc > 0) ? ZEND_CALL_ARG(execute_data, 1) : NULL;
zval *params = NULL;
if (argc > 0) {
params = emalloc(argc * sizeof(zval));
for (uint32_t i = 0; i < argc; i++) {
ZVAL_COPY(&params[i], &args[i]);
}
}
zend_fcall_info fci = empty_fcall_info;
zend_fcall_info_cache fcc = empty_fcall_info_cache;
fci.size = sizeof(fci);
fci.retval = return_value;
fci.param_count = argc;
fci.params = params;
zend_fcall_info_init(stored_replacement, 0, &fci, &fcc, NULL, NULL);
fci.retval = return_value;
fci.params = params;
fci.param_count = argc;
zend_call_function(&fci, &fcc);
for (uint32_t i = 0; i < argc; i++) {
zval_ptr_dtor(&params[i]);
}
if (params) efree(params);
}
Yes, that stored_replacement global is terrible. One replacement at a time, no cleanup, no thread safety. We’ll fix it in one of the next parts. Right now we just want to see if the replacement works at all.
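To see why a single global slot only supports one replacement, here is a minimal stand-alone sketch. It uses plain C function pointers instead of closures and zvals, and all `toy_` names are hypothetical. Because every wrapper shares the same handler and the same slot, the most recent replacement wins for all of them.

```c
#include <stddef.h>

typedef int (*toy_closure)(int);

/* One global slot, playing the role of stored_replacement. */
static toy_closure toy_stored = NULL;

/* Shared handler: it has no idea which function name it was installed under. */
static int toy_wrapper_handler(int arg) {
    return toy_stored ? toy_stored(arg) : 0;
}

/* Installing a replacement overwrites whatever was stored before. */
static void toy_replace(toy_closure replacement) {
    toy_stored = replacement;
}

/* Two sample "closures" to install. */
static int toy_always_999(int arg) { (void)arg; return 999; }
static int toy_negate(int arg) { return -arg; }
```

After `toy_replace(toy_always_999)` followed by `toy_replace(toy_negate)`, every call through `toy_wrapper_handler` runs `toy_negate`, including calls that were meant to reach the first replacement.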
Now wire the handler into replace_function:
funswap.c, update replace_function to use the handler:
/* funswap.c — updated replace_function */
static int replace_function(zend_string *fn_name, zval *replacement)
{
zend_string *fn_lc = zend_string_tolower(fn_name);
zend_function *original = zend_hash_find_ptr(EG(function_table), fn_lc);
if (!original) {
zend_string_release(fn_lc);
return 0;
}
/* Store the replacement closure (yes, this is a hack) */
if (stored_replacement) {
zval_ptr_dtor(stored_replacement);
efree(stored_replacement);
}
stored_replacement = emalloc(sizeof(zval));
ZVAL_COPY(stored_replacement, replacement);
/* Build a wrapper internal function */
zend_internal_function wrapper;
memset(&wrapper, 0, sizeof(wrapper));
wrapper.type = ZEND_INTERNAL_FUNCTION;
wrapper.function_name = zend_string_copy(original->common.function_name);
wrapper.fn_flags = ZEND_ACC_VARIADIC;
wrapper.handler = funswap_wrapper_handler;
wrapper.module = &funswap_module_entry;
/* Replace in function table */
zend_hash_update_mem(EG(function_table), fn_lc, &wrapper,
sizeof(zend_internal_function));
zend_string_release(fn_lc);
return 1;
}
Build it. Load it. Try replacing strlen:
$ docker compose run debian php -d extension=modules/funswap.so -r '
\FunSwap\replace("strlen", function(string $s): int {
return 999;
});
echo strlen("hello") . "\n";
'
999
It works. We replaced strlen, a built-in C function that’s been in PHP since day one, with a closure that returns 999. Calling strlen("hello") now returns 999 instead of 5.
That was… easy?
Now Try a User Function
Let’s try replacing a user-defined function:
$ docker compose run debian php -d extension=modules/funswap.so -r '
function greet(string $name): string {
return "Hello, $name!";
}
\FunSwap\replace("greet", function(string $name): string {
return "Yo, $name!";
});
echo greet("World") . "\n";
'
Segmentation fault (core dumped)
Segfault. Not an exception, not a warning. A hard crash.
What Just Happened?
Time for GDB. We build with --enable-debug for exactly this moment:
$ docker compose run debian gdb -batch -ex run -ex bt \
-args php -d extension=modules/funswap.so /tmp/test.php
Program received signal SIGSEGV, Segmentation fault.
zend_init_cvs (first=0, last=1852343716)
at Zend/zend_execute.c:3990
ZVAL_UNDEF(var);
#0 zend_init_cvs (first=0, last=1852343716)
#1 i_init_func_execute_data (op_array=0x...)
#2 ZEND_DO_UCALL_SPEC_RETVAL_USED_HANDLER ()
The crash is in zend_init_cvs, called from i_init_func_execute_data, called from ZEND_DO_UCALL_SPEC_RETVAL_USED_HANDLER.
DO_UCALL. Remember that name. It’s the opcode handler for calling user functions.
The VM is trying to initialize compiled variables (CVs) for what it thinks is a user function with an op_array. But we replaced the function with a zend_internal_function. It doesn’t have an op_array, doesn’t have compiled variables, and has a completely different struct layout. The VM reads op_array.last_var from garbage memory, gets 1852343716, and tries to initialize that many variables. Instant segfault.
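The same kind of misread can be reproduced safely in a toy union. This is only a sketch with invented layouts, not the real Zend structs. The internal-function member stores an unrelated 32-bit value at the offset where the user-function member keeps `last_var`, so reading through the wrong member returns that unrelated value. The real bytes that produced 1852343716 are not known here; the number is reused purely as sample data.

```c
#include <stddef.h>
#include <stdint.h>

enum { TOY_INTERNAL_FUNCTION = 1, TOY_USER_FUNCTION = 2 };

/* Hypothetical layouts: same tag, then different meanings for the same bytes. */
typedef struct {
    uint8_t type;
    uint32_t last_var;     /* number of compiled variables */
} toy_op_array;

typedef struct {
    uint8_t type;
    uint32_t handler_bits; /* something unrelated to variables */
} toy_internal_function;

typedef union {
    uint8_t type;
    toy_op_array op_array;
    toy_internal_function internal_function;
} toy_function;

/* What a user-call path effectively does: assume fn is a user function. */
static uint32_t toy_last_var_unchecked(const toy_function *fn) {
    return fn->op_array.last_var;
}

/* Checked variant: returns 1 and fills *out only for user functions. */
static int toy_last_var_checked(const toy_function *fn, uint32_t *out) {
    if (fn->type != TOY_USER_FUNCTION) {
        return 0;
    }
    *out = fn->op_array.last_var;
    return 1;
}
```

In the toy the unchecked read merely returns a nonsense count. In the VM that count drives a loop that writes to memory, which is where the segfault comes from.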
Why strlen Worked and greet Didn’t
Here’s the key: PHP compiles function calls to different opcodes depending on the function type at compile time.
When the compiler sees strlen("hello") and looks up strlen in the function table, it finds an internal function. It emits DO_ICALL, the opcode for calling internal functions.
When the compiler sees greet("World") and looks up greet, it finds a user function. It emits DO_UCALL, the opcode for calling user functions.
There’s also DO_FCALL, a generic “call whatever this is” opcode. The compiler uses it when it can’t resolve the function at compile time (variable function calls, calls to functions not yet defined, etc.).
The problem: the entire PHP file is compiled to opcodes BEFORE any line of it executes. By the time our replace() runs and swaps greet from a user function to an internal function, the call site greet("World") has already been compiled to DO_UCALL. The opcode is baked in. Our replacement changes the function table, but the opcode still says “this is a user function, initialize its op_array.”
Internal function → internal function: DO_ICALL handles both. Works fine.
User function → internal function: DO_UCALL can’t handle an internal function. Segfault.
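The two cases above can be modelled in a few lines of stand-alone C. This is a simplified sketch, assuming only the three opcodes named in this article; the `toy_` names are hypothetical and the real compiler has more conditions than this. One function picks an opcode from what the function table holds at compile time, and another shows which opcode survives a swap at run time.

```c
#include <stddef.h>

typedef enum { TOY_INTERNAL = 1, TOY_USER = 2 } toy_fn_type;
typedef enum { TOY_DO_ICALL, TOY_DO_UCALL, TOY_DO_FCALL } toy_call_op;
typedef enum { TOY_OK = 0, TOY_WOULD_CRASH = -1 } toy_status;

typedef struct {
    toy_fn_type type;
} toy_function;

/* "Compile time": choose an opcode from what the table holds right now.
 * NULL means the function could not be resolved yet. */
static toy_call_op toy_compile_call(const toy_function *known) {
    if (known == NULL) {
        return TOY_DO_FCALL;
    }
    return known->type == TOY_INTERNAL ? TOY_DO_ICALL : TOY_DO_UCALL;
}

/* "Run time": the specialised opcodes trust the compile-time guess,
 * the generic one inspects the function it actually finds. */
static toy_status toy_execute_call(toy_call_op op, const toy_function *actual) {
    switch (op) {
    case TOY_DO_ICALL:
        return actual->type == TOY_INTERNAL ? TOY_OK : TOY_WOULD_CRASH;
    case TOY_DO_UCALL:
        return actual->type == TOY_USER ? TOY_OK : TOY_WOULD_CRASH;
    case TOY_DO_FCALL:
        return TOY_OK;
    }
    return TOY_WOULD_CRASH;
}
```

Compiling a call against a user function and then executing it against an internal wrapper yields `TOY_WOULD_CRASH`, which is the greet case. Compiling against an internal function, or against nothing at all, stays `TOY_OK` after the swap.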
So… Is This Impossible?
I stared at this for a while. The GDB backtrace told me where it crashed, not why only user functions crashed. I read through zend_vm_execute.h, tried wrapping the replacement in different ways, even considered patching the op_array in place. It took me an embarrassingly long time to realize the answer wasn’t in our C code at all. It was in how PHP compiles function calls.
The compiler only emits DO_UCALL when it can see, at compile time, that the callee is a user function. When it can’t resolve the function (for example, because it is defined in a different file that hasn’t been compiled yet), it falls back to DO_FCALL, which checks the function type at run time and handles both.
This means our replacement works if we structure the code like a real PHP application: define and replace in one file, call from another.
$ docker compose run debian php -d extension=modules/funswap.so -r '
require "/tmp/setup.php"; // defines greet() and replaces it
require "/tmp/call.php"; // calls greet(), compiled AFTER replacement
'
When call.php is compiled, greet is already our internal wrapper. The compiler sees an internal function, emits DO_FCALL (or DO_ICALL), and everything works.
In real PHP applications with autoloading and bootstrap files, this is the natural pattern. You’d call \FunSwap\replace() during bootstrap, before any of the calling code is compiled.
But in a single-file test script? Segfault.
The Full Picture
Your funswap.c should now look like this:
/* funswap.c */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#include "php.h"
#include "ext/standard/info.h"
#include "php_funswap.h"
#include "funswap_arginfo.h"
#include "zend_closures.h"
/* --- Replacement storage (global hack, one at a time) --- */
static zval *stored_replacement = NULL;
/* --- Wrapper handler: calls the stored replacement closure --- */
static void funswap_wrapper_handler(INTERNAL_FUNCTION_PARAMETERS) {
if (!stored_replacement) {
RETURN_NULL();
}
uint32_t argc = ZEND_CALL_NUM_ARGS(execute_data);
zval *args = (argc > 0) ? ZEND_CALL_ARG(execute_data, 1) : NULL;
zval *params = NULL;
if (argc > 0) {
params = emalloc(argc * sizeof(zval));
for (uint32_t i = 0; i < argc; i++) {
ZVAL_COPY(&params[i], &args[i]);
}
}
zend_fcall_info fci = empty_fcall_info;
zend_fcall_info_cache fcc = empty_fcall_info_cache;
fci.size = sizeof(fci);
fci.retval = return_value;
fci.param_count = argc;
fci.params = params;
zend_fcall_info_init(stored_replacement, 0, &fci, &fcc, NULL, NULL);
fci.retval = return_value;
fci.params = params;
fci.param_count = argc;
zend_call_function(&fci, &fcc);
for (uint32_t i = 0; i < argc; i++) {
zval_ptr_dtor(&params[i]);
}
if (params) efree(params);
}
/* --- Function replacement logic --- */
static int replace_function(zend_string *fn_name, zval *replacement)
{
zend_string *fn_lc = zend_string_tolower(fn_name);
zend_function *original = zend_hash_find_ptr(EG(function_table), fn_lc);
if (!original) {
zend_string_release(fn_lc);
return 0;
}
/* Store the replacement closure */
if (stored_replacement) {
zval_ptr_dtor(stored_replacement);
efree(stored_replacement);
}
stored_replacement = emalloc(sizeof(zval));
ZVAL_COPY(stored_replacement, replacement);
/* Build a wrapper internal function */
zend_internal_function wrapper;
memset(&wrapper, 0, sizeof(wrapper));
wrapper.type = ZEND_INTERNAL_FUNCTION;
wrapper.function_name = zend_string_copy(original->common.function_name);
wrapper.fn_flags = ZEND_ACC_VARIADIC;
wrapper.handler = funswap_wrapper_handler;
wrapper.module = &funswap_module_entry;
/* Replace in function table */
zend_hash_update_mem(EG(function_table), fn_lc, &wrapper,
sizeof(zend_internal_function));
zend_string_release(fn_lc);
return 1;
}
/* --- PHP_FUNCTION(replace) --- */
ZEND_FUNCTION(FunSwap_replace) {
zend_string *function_name;
zval *replacement;
ZEND_PARSE_PARAMETERS_START(2, 2)
Z_PARAM_STR(function_name)
Z_PARAM_OBJECT_OF_CLASS(replacement, zend_ce_closure)
ZEND_PARSE_PARAMETERS_END();
RETURN_BOOL(replace_function(function_name, replacement));
}
/* --- Module lifecycle --- */
PHP_MINIT_FUNCTION(funswap) {
#if defined(ZTS) && defined(COMPILE_DL_FUNSWAP)
ZEND_TSRMLS_CACHE_UPDATE();
#endif
return SUCCESS;
}
PHP_MINFO_FUNCTION(funswap) {
php_info_print_table_start();
php_info_print_table_row(2, "funswap support", "enabled");
php_info_print_table_row(2, "funswap version", PHP_FUNSWAP_VERSION);
php_info_print_table_end();
}
/* --- Module entry --- */
zend_module_entry funswap_module_entry = {
STANDARD_MODULE_HEADER,
"funswap",
ext_functions,
PHP_MINIT(funswap),
NULL,
NULL,
NULL,
PHP_MINFO(funswap),
PHP_FUNSWAP_VERSION,
STANDARD_MODULE_PROPERTIES,
};
#ifdef COMPILE_DL_FUNSWAP
#ifdef ZTS
ZEND_TSRMLS_CACHE_DEFINE()
#endif
ZEND_GET_MODULE(funswap)
#endif
No other files changed since part 1. Everything new lives in funswap.c.
To be continued…
In the next article, we’ll dig deeper into why the Zend VM makes this decision at compile time, what DO_UCALL vs DO_FCALL actually do differently, and whether we can work around it without the two-file pattern.