Putting It All Together — Building a PHP Extension That Does Nothing (Useful) — part 4

Over three parts, we’ve built a PHP extension from scratch and discovered that replacing functions in PHP’s function table is both simpler and harder than it looks. We replaced strlen on the first try, segfaulted on user functions, learned about DO_UCALL vs DO_FCALL, discovered runtime cache slots, and found the two-file pattern that makes everything work.

But our code is a mess. A global stored_replacement pointer. One replacement at a time. No cleanup. No tests. Time to fix that.

The Problems With Our Current Code

Looking at funswap.c from part 2, there are several issues:

  1. One replacement at a time: stored_replacement is a single global pointer. Replace two functions and the first replacement is lost.
  2. No cleanup: we never free the stored replacement or the copied original. Memory leaks every request.
  3. No way to map wrapper to replacement: all wrappers share the same handler, which reads the same global. If we replace both strlen and strtolower, they both call the same closure.
  4. No tests: we’ve been testing manually with php -r.

A Proper Registry

Instead of a global pointer, we need a hash table that maps function names to their replacement closures. When the wrapper handler fires, it looks up which replacement to call based on its own function name.

First, update the header to add module globals.

php_funswap.h, add globals:

/* php_funswap.h — updated with module globals */
#ifndef PHP_FUNSWAP_H
#define PHP_FUNSWAP_H

#include "php.h"

extern zend_module_entry funswap_module_entry;
#define phpext_funswap_ptr &funswap_module_entry

#define PHP_FUNSWAP_VERSION "0.1.0"

ZEND_BEGIN_MODULE_GLOBALS(funswap)
    HashTable *replacements;   /* fn_name_lc → zval (Closure) */
    HashTable *originals;      /* fn_name_lc → zend_function* */
ZEND_END_MODULE_GLOBALS(funswap)

ZEND_EXTERN_MODULE_GLOBALS(funswap)

#define FS_G(v) ZEND_MODULE_GLOBALS_ACCESSOR(funswap, v)

#if defined(ZTS) && defined(COMPILE_DL_FUNSWAP)
ZEND_TSRMLS_CACHE_EXTERN()
#endif

#endif

Two hash tables: replacements maps lowercased function names to the replacement Closure zvals, and originals maps the same names to our stored zend_function copies. FS_G() is our globals accessor macro.
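If the globals macros feel like magic, here is roughly what they boil down to in a non-thread-safe (non-ZTS) build. This is a stand-alone model and does not include the Zend headers: the DEMO_* macro names, the void * fields, and the fs_set_tables helper are stand-ins invented for illustration. In a ZTS build the real accessor goes through thread-local storage instead of a plain struct.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for ZEND_BEGIN/END_MODULE_GLOBALS (non-ZTS shape). */
#define DEMO_BEGIN_MODULE_GLOBALS(name) typedef struct {
#define DEMO_END_MODULE_GLOBALS(name)   } zend_##name##_globals;
/* Stand-in for ZEND_DECLARE_MODULE_GLOBALS: one struct with static storage. */
#define DEMO_DECLARE_MODULE_GLOBALS(name) zend_##name##_globals name##_globals;
/* Stand-in for ZEND_MODULE_GLOBALS_ACCESSOR: plain member access. */
#define DEMO_MODULE_GLOBALS_ACCESSOR(name, v) (name##_globals.v)

DEMO_BEGIN_MODULE_GLOBALS(funswap)
    void *replacements;  /* HashTable * in the real extension */
    void *originals;     /* HashTable * in the real extension */
DEMO_END_MODULE_GLOBALS(funswap)

DEMO_DECLARE_MODULE_GLOBALS(funswap)

#define FS_G(v) DEMO_MODULE_GLOBALS_ACCESSOR(funswap, v)

/* FS_G(x) expands to an lvalue, so it reads and assigns like a variable. */
static void fs_set_tables(void *replacements, void *originals)
{
    assert(replacements != NULL && originals != NULL);
    FS_G(replacements) = replacements;
    FS_G(originals) = originals;
}

static int registry_is_ready(void)
{
    return FS_G(replacements) != NULL && FS_G(originals) != NULL;
}
```

The point of going through FS_G() everywhere is that the same source compiles in both build modes; only the macro expansion changes.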

Now rewrite funswap.c to use these:

funswap.c, complete rewrite:

/* funswap.c */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif

#include "php.h"
#include "ext/standard/info.h"
#include "php_funswap.h"
#include "funswap_arginfo.h"
#include "zend_closures.h"
#include "zend_compile.h" /* function_add_ref, destroy_op_array */

funswap.c, declare module globals:

/* funswap.c — continued */
ZEND_DECLARE_MODULE_GLOBALS(funswap)

funswap.c, the wrapper handler now looks up the replacement by function name:

/* funswap.c — continued */
static void funswap_wrapper_handler(INTERNAL_FUNCTION_PARAMETERS)
{
    zend_string *fn_name = execute_data->func->common.function_name;
    zend_string *fn_lc = zend_string_tolower(fn_name);

    /* Find the replacement closure for this function */
    zval *replacement = zend_hash_find(FS_G(replacements), fn_lc);
    zend_string_release(fn_lc);

    if (!replacement) {
        RETURN_NULL();
    }

    zend_fcall_info fci = empty_fcall_info;
    zend_fcall_info_cache fcc = empty_fcall_info_cache;

    /* Resolve the closure first. zend_fcall_info_init() resets retval,
       params and param_count, so those are filled in afterwards. */
    if (zend_fcall_info_init(replacement, 0, &fci, &fcc, NULL, NULL) != SUCCESS) {
        RETURN_NULL();
    }

    /* Forward all arguments to the replacement */
    uint32_t argc = ZEND_CALL_NUM_ARGS(execute_data);
    zval *args = (argc > 0) ? ZEND_CALL_ARG(execute_data, 1) : NULL;

    zval *params = NULL;
    if (argc > 0) {
        params = safe_emalloc(argc, sizeof(zval), 0);
        for (uint32_t i = 0; i < argc; i++) {
            ZVAL_COPY(&params[i], &args[i]);
        }
    }

    fci.retval = return_value;
    fci.params = params;
    fci.param_count = argc;

    zend_call_function(&fci, &fcc);

    for (uint32_t i = 0; i < argc; i++) {
        zval_ptr_dtor(&params[i]);
    }
    if (params) efree(params);
}

Each wrapper instance has the same handler, but execute_data->func->common.function_name tells it which function was called. It looks up the matching replacement in the hash table. Now we can replace as many functions as we want.
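Stripped of the Zend API, the dispatch idea is small enough to model in plain C. The sketch below is not extension code: the fixed-size array stands in for the HashTable, function pointers stand in for Closure zvals, and every name in it (registry_replace, shared_handler, always_111, count_a) is made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 8
#define MAX_NAME    64

typedef long (*replacement_fn)(const char *arg);

typedef struct {
    char           name_lc[MAX_NAME]; /* lowercased key, like fn_lc */
    replacement_fn fn;
} entry;

static entry  entries[MAX_ENTRIES];
static size_t entry_count;

/* Lowercase src into dst; returns 0 if src does not fit. */
static int lower_copy(char *dst, size_t cap, const char *src)
{
    size_t n = strlen(src);
    if (n >= cap) return 0;
    for (size_t i = 0; i <= n; i++) /* <= n copies the terminator too */
        dst[i] = (char)tolower((unsigned char)src[i]);
    return 1;
}

static entry *find_entry(const char *name_lc)
{
    for (size_t i = 0; i < entry_count; i++)
        if (strcmp(entries[i].name_lc, name_lc) == 0) return &entries[i];
    return NULL;
}

/* Register or overwrite the replacement for name (case-insensitive). */
static int registry_replace(const char *name, replacement_fn fn)
{
    char key[MAX_NAME];
    assert(name != NULL && fn != NULL);
    if (!lower_copy(key, sizeof key, name)) return 0;
    entry *e = find_entry(key);
    if (!e) {
        if (entry_count == MAX_ENTRIES) return 0;
        e = &entries[entry_count++];
        strcpy(e->name_lc, key); /* same capacity, so this fits */
    }
    e->fn = fn;
    return 1;
}

/* The one shared "wrapper handler": the called name picks the replacement.
 * Returns -1 when nothing is registered (the real handler returns NULL). */
static long shared_handler(const char *called_name, const char *arg)
{
    char key[MAX_NAME];
    if (!lower_copy(key, sizeof key, called_name)) return -1;
    entry *e = find_entry(key);
    return e ? e->fn(arg) : -1;
}

/* Two toy replacements so the dispatch is observable. */
static long always_111(const char *arg) { (void)arg; return 111; }
static long count_a(const char *arg)
{
    long n = 0;
    for (; *arg; arg++) if (*arg == 'a') n++;
    return n;
}
```

Registering always_111 under "strlen" and count_a under "StrToLower" and then calling shared_handler("STRLEN", ...) and shared_handler("strtolower", ...) reaches two different replacements through the same handler, which is exactly what the single global pointer could not do.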

funswap.c, the replacement logic:

/* funswap.c — continued */
static int replace_function(zend_string *fn_name, zval *replacement)
{
    zend_string *fn_lc = zend_string_tolower(fn_name);

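    /* Already replaced in this request: just swap in the new closure.
       Copying the "original" again would capture our own wrapper and
       leak the first copy. */
    if (zend_hash_exists(FS_G(originals), fn_lc)) {
        zval repl_again;
        ZVAL_COPY(&repl_again, replacement);
        zend_hash_update(FS_G(replacements), fn_lc, &repl_again);
        zend_string_release(fn_lc);
        return 1;
    }

    zend_function *original = zend_hash_find_ptr(EG(function_table), fn_lc);
    if (!original) {
        zend_string_release(fn_lc);
        return 0;
    }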

    /* Copy and store the original */
    size_t fn_size = (original->type == ZEND_USER_FUNCTION)
                         ? sizeof(zend_op_array)
                         : sizeof(zend_internal_function);
    zend_function *original_copy = emalloc(fn_size);
    memcpy(original_copy, original, fn_size);
    if (original->type == ZEND_USER_FUNCTION) {
        function_add_ref(original_copy);
    }

    zval orig_zv;
    ZVAL_PTR(&orig_zv, original_copy);
    zend_hash_update(FS_G(originals), fn_lc, &orig_zv);

    /* Store the replacement closure */
    zval repl_copy;
    ZVAL_COPY(&repl_copy, replacement);
    zend_hash_update(FS_G(replacements), fn_lc, &repl_copy);

    /* Build the wrapper */
    zend_internal_function wrapper;
    memset(&wrapper, 0, sizeof(wrapper));
    wrapper.type = ZEND_INTERNAL_FUNCTION;
    wrapper.function_name = zend_string_copy(original->common.function_name);
    wrapper.fn_flags = ZEND_ACC_VARIADIC
                     | (original->common.fn_flags
                        & (ZEND_ACC_PUBLIC | ZEND_ACC_PROTECTED
                           | ZEND_ACC_PRIVATE | ZEND_ACC_STATIC
                           | ZEND_ACC_RETURN_REFERENCE));
    wrapper.handler = funswap_wrapper_handler;
    wrapper.module = &funswap_module_entry;

    /* Replace */
    zend_hash_update_mem(EG(function_table), fn_lc, &wrapper,
                         sizeof(zend_internal_function));
    zend_string_release(fn_lc);
    return 1;
}

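Note: Our wrapper sets ZEND_ACC_VARIADIC and leaves num_args at 0 (from the
memset), so it accepts any arguments at the C level. But ReflectionFunction on
the replaced function will show 0 parameters. If you need Reflection to report
the original signature, copy original->common.arg_info and
original->common.num_args to the wrapper. That is only straightforward when the
original is itself internal: user functions describe their parameters with
zend_arg_info rather than zend_internal_arg_info, and the two have historically
stored the parameter name differently. For this series, we skip that. Keep it
in mind for production use.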

funswap.c, the PHP function:

/* funswap.c — continued */
ZEND_FUNCTION(FunSwap_replace) {
    zend_string *function_name;
    zval *replacement;

    ZEND_PARSE_PARAMETERS_START(2, 2)
        Z_PARAM_STR(function_name)
        Z_PARAM_OBJECT_OF_CLASS(replacement, zend_ce_closure)
    ZEND_PARSE_PARAMETERS_END();

    RETURN_BOOL(replace_function(function_name, replacement));
}

Lifecycle: Init and Cleanup

The hash tables need to be created at request start and destroyed at request end. This is what RINIT and RSHUTDOWN are for:

funswap.c, lifecycle functions:

/* funswap.c — continued */
PHP_GINIT_FUNCTION(funswap) {
    ZEND_SECURE_ZERO(funswap_globals, sizeof(*funswap_globals));
}

PHP_RINIT_FUNCTION(funswap) {
#if defined(ZTS) && defined(COMPILE_DL_FUNSWAP)
    ZEND_TSRMLS_CACHE_UPDATE();
#endif
    ALLOC_HASHTABLE(FS_G(replacements));
    zend_hash_init(FS_G(replacements), 8, NULL,
                   (dtor_func_t)zval_ptr_dtor, 0);

    ALLOC_HASHTABLE(FS_G(originals));
    zend_hash_init(FS_G(originals), 8, NULL, NULL, 0);
    return SUCCESS;
}

PHP_RSHUTDOWN_FUNCTION(funswap) {
    /* Free replacement closures */
    if (FS_G(replacements)) {
        zend_hash_destroy(FS_G(replacements));
        FREE_HASHTABLE(FS_G(replacements));
        FS_G(replacements) = NULL;
    }

    /* Free stored originals */
    if (FS_G(originals)) {
        zval *zv;
        ZEND_HASH_FOREACH_VAL(FS_G(originals), zv) {
            zend_function *fn = Z_PTR_P(zv);
            if (fn->type == ZEND_USER_FUNCTION) {
                destroy_op_array(&fn->op_array);
            }
            efree(fn);
        } ZEND_HASH_FOREACH_END();
        zend_hash_destroy(FS_G(originals));
        FREE_HASHTABLE(FS_G(originals));
        FS_G(originals) = NULL;
    }
    return SUCCESS;
}

PHP_MINIT_FUNCTION(funswap) {
#if defined(ZTS) && defined(COMPILE_DL_FUNSWAP)
    ZEND_TSRMLS_CACHE_UPDATE();
#endif
    return SUCCESS;
}

PHP_MINFO_FUNCTION(funswap) {
    php_info_print_table_start();
    php_info_print_table_row(2, "funswap support", "enabled");
    php_info_print_table_row(2, "funswap version", PHP_FUNSWAP_VERSION);
    php_info_print_table_end();
}
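The ownership rules in RINIT and RSHUTDOWN are easier to see without the Zend types in the way. Here is a plain-C model, assuming a toy fixed-size table and an allocation counter; none of these names (tracked_alloc, table_new, simulate_request, and so on) exist in PHP. One table frees its values through a destructor callback, like replacements with zval_ptr_dtor, and the other has no destructor and is freed by hand, like originals.

```c
#include <assert.h>
#include <stdlib.h>

#define TABLE_CAP 8

static int live_allocs; /* blocks handed out and not yet freed */

static void *tracked_alloc(size_t n)
{
    void *p = malloc(n);
    if (p) live_allocs++;
    return p;
}

static void tracked_free(void *p)
{
    if (p) { live_allocs--; free(p); }
}

typedef void (*dtor_fn)(void *value);

typedef struct {
    void   *values[TABLE_CAP];
    size_t  len;
    dtor_fn dtor; /* NULL: the owner frees the values itself */
} table;

static table *table_new(dtor_fn dtor)
{
    table *t = tracked_alloc(sizeof *t);
    if (!t) return NULL;
    t->len = 0;
    t->dtor = dtor;
    return t;
}

/* On success the table owns value; on 0 the caller keeps ownership. */
static int table_add(table *t, void *value)
{
    assert(t != NULL);
    if (!value || t->len == TABLE_CAP) return 0;
    t->values[t->len++] = value;
    return 1;
}

static void table_destroy(table *t)
{
    if (!t) return;
    if (t->dtor)
        for (size_t i = 0; i < t->len; i++) t->dtor(t->values[i]);
    tracked_free(t);
}

static table *replacements, *originals;

/* Plays the role of RINIT: fresh, empty tables for every request. */
static int request_init(void)
{
    replacements = table_new(tracked_free); /* values freed by the dtor */
    originals    = table_new(NULL);         /* values freed by hand */
    return replacements && originals;
}

/* Plays the role of RSHUTDOWN: release everything the request stored. */
static void request_shutdown(void)
{
    table_destroy(replacements);
    replacements = NULL;
    if (originals) {
        for (size_t i = 0; i < originals->len; i++)
            tracked_free(originals->values[i]);
        table_destroy(originals);
        originals = NULL;
    }
}

/* One fake request; returns how many blocks are still live afterwards. */
static int simulate_request(void)
{
    if (request_init()) {
        void *closure = tracked_alloc(16), *fn_copy = tracked_alloc(32);
        if (!table_add(replacements, closure)) tracked_free(closure);
        if (!table_add(originals, fn_copy)) tracked_free(fn_copy);
    }
    request_shutdown();
    return live_allocs;
}
```

If simulate_request() returns 0 however many times it runs, nothing is carried from one request to the next, which is the property the old single-global version lacked.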

funswap.c, updated module entry (note the added RINIT, RSHUTDOWN, GINIT):

/* funswap.c — continued */
zend_module_entry funswap_module_entry = {
    STANDARD_MODULE_HEADER,
    "funswap",
    ext_functions,
    PHP_MINIT(funswap),
    NULL,                       /* MSHUTDOWN */
    PHP_RINIT(funswap),
    PHP_RSHUTDOWN(funswap),
    PHP_MINFO(funswap),
    PHP_FUNSWAP_VERSION,
    PHP_MODULE_GLOBALS(funswap),
    PHP_GINIT(funswap),
    NULL,                       /* GSHUTDOWN */
    NULL,                       /* post deactivate */
    STANDARD_MODULE_PROPERTIES_EX,  /* note: _EX, not plain */
};

#ifdef COMPILE_DL_FUNSWAP
#ifdef ZTS
ZEND_TSRMLS_CACHE_DEFINE()
#endif
ZEND_GET_MODULE(funswap)
#endif

Note the change from STANDARD_MODULE_PROPERTIES to STANDARD_MODULE_PROPERTIES_EX. This is required when using PHP_MODULE_GLOBALS and PHP_GINIT. Without _EX, the struct layout is wrong and PHP crashes on load. That one took me a while to figure out.
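To see why the suffix matters, it helps to look at what the two macros expand to. The sketch below is a simplified model rather than the real zend_modules.h: the struct keeps only the fields that follow the version string, with plain types, and the DEMO_* macros mirror the shape of STANDARD_MODULE_PROPERTIES_EX, NO_MODULE_GLOBALS and STANDARD_MODULE_PROPERTIES as best I understand them. Check the header of your own PHP version before relying on the details.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified tail of a module entry: the fields after the version string. */
typedef struct {
    const char   *version;
    size_t        globals_size;
    void         *globals_ptr;
    void        (*globals_ctor)(void *);
    void        (*globals_dtor)(void *);
    int         (*post_deactivate)(void);
    int           module_started;
    unsigned char type;
    void         *handle;
    int           module_number;
    const char   *build_id;
} demo_entry_tail;

#define DEMO_BUILD_ID      "API-demo"
/* Five trailing fields: module_started, type, handle, module_number, build_id */
#define DEMO_PROPERTIES_EX 0, 0, NULL, 0, DEMO_BUILD_ID
/* Four placeholder globals fields: size, pointer, ctor, dtor */
#define DEMO_NO_GLOBALS    0, NULL, NULL, NULL
/* The plain macro is: placeholders + post-deactivate slot + _EX */
#define DEMO_PROPERTIES    DEMO_NO_GLOBALS, NULL, DEMO_PROPERTIES_EX

static int demo_globals;
static void demo_ginit(void *g) { (void)g; }

/* No globals: the plain macro fills the globals slots with placeholders. */
static const demo_entry_tail without_globals = { "0.1.0", DEMO_PROPERTIES };

/* With globals: we fill those slots ourselves, so only _EX may follow. */
static const demo_entry_tail with_globals = {
    "0.1.0",
    sizeof(demo_globals), &demo_globals, /* like PHP_MODULE_GLOBALS() */
    demo_ginit,                          /* like PHP_GINIT() */
    NULL,                                /* GSHUTDOWN */
    NULL,                                /* post deactivate */
    DEMO_PROPERTIES_EX
};

/* Spells out which slot each initializer landed in. */
static void check_layouts(void)
{
    assert(without_globals.globals_size == 0);
    assert(without_globals.globals_ctor == NULL);
    assert(without_globals.build_id != NULL);
    assert(with_globals.globals_size == sizeof(demo_globals));
    assert(with_globals.globals_ctor == demo_ginit);
    assert(with_globals.build_id != NULL);
}
```

Putting DEMO_PROPERTIES after the hand-written globals fields would supply those five slots a second time, pushing the placeholder values into the trailing fields and leaving build_id without a string. That matches the symptom: the mistake shows up when the module is loaded, not when it is compiled cleanly.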

Writing .phpt Tests

PHP extensions use .phpt files for tests. Each test is a file with sections: --TEST-- (name), --EXTENSIONS-- (required extensions), --FILE-- (PHP code), --EXPECT-- (expected output).

Remember the two-file pattern: define and replace in a setup file, call from the test file via require.

ext/tests/replace_internal.phpt, replacing an internal function:

--TEST--
FunSwap: replace internal function (strlen)
--EXTENSIONS--
funswap
--FILE--
<?php
\FunSwap\replace("strlen", function(string $s): int {
    return 999;
});

$code = '<?php echo strlen("hello") . "\n";';
$tmp = tempnam(sys_get_temp_dir(), 'fs');
file_put_contents($tmp, $code);
require $tmp;
unlink($tmp);
?>
--EXPECT--
999

You might wonder whether internal functions like strlen need the two-file pattern at all. For them, the compiler emits DO_ICALL when it finds the function in the function table. After our replacement, the function is still internal (our wrapper is ZEND_INTERNAL_FUNCTION), so DO_ICALL still works. So no, we don’t actually need the two-file pattern for internal→internal replacement. But using it consistently doesn’t hurt, and it makes the tests uniform.

ext/tests/replace_multiple.phpt, multiple replacements:

--TEST--
FunSwap: replace multiple functions independently
--EXTENSIONS--
funswap
--FILE--
<?php
\FunSwap\replace("strlen", function(string $s): int {
    return 111;
});

\FunSwap\replace("strtolower", function(string $s): string {
    return "NOPE";
});

$code = '<?php
echo strlen("hello") . "\n";
echo strtolower("HELLO") . "\n";
';
$tmp = tempnam(sys_get_temp_dir(), 'fs');
file_put_contents($tmp, $code);
require $tmp;
unlink($tmp);
?>
--EXPECT--
111
NOPE

ext/tests/replace_user_function.phpt, replacing a user function:

--TEST--
FunSwap: replace user function (two-file pattern)
--EXTENSIONS--
funswap
--FILE--
<?php
function greet(string $name): string {
    return "Hello, $name!";
}

\FunSwap\replace("greet", function(string $name): string {
    return "Yo, $name!";
});

$code = '<?php echo greet("World") . "\n";';
$tmp = tempnam(sys_get_temp_dir(), 'fs');
file_put_contents($tmp, $code);
require $tmp;
unlink($tmp);
?>
--EXPECT--
Yo, World!

Run them:

$ docker compose run debian php run-tests.php \
    -d extension=modules/funswap.so tests/*.phpt

Running selected tests.
PASS FunSwap: replace internal function (strlen) [tests/replace_internal.phpt]
PASS FunSwap: replace multiple functions independently [tests/replace_multiple.phpt]
PASS FunSwap: replace user function (two-file pattern) [tests/replace_user_function.phpt]

Number of tests :    3                3
Tests passed    :    3 (100.0%)

Green. All three pass.

What We Built

Over four articles, starting from zero C extension experience, we built funswap, a PHP extension that can replace any function at runtime with a closure. Here’s what we learned along the way:

Part 1: The PHP extension lifecycle. Headers, module entry, stubs, arginfo, Docker build.

Part 2: Function table replacement works for internal functions. User functions segfault. The crash teaches us about zend_function as a union of zend_op_array and zend_internal_function.

Part 3: The segfault isn’t a bug in our code. It’s the Zend VM compiling DO_UCALL opcodes at compile time, before our runtime replacement happens. The fix is the two-file pattern. The Observer API can’t skip function execution. Runtime cache slots add another layer of “replace before first call.”

Part 4: Proper registry with module globals, per-request lifecycle, memory cleanup, and .phpt tests. The extension works.

The complete extension code is available on GitHub: [TODO: add repo link]. Clone it, run make build-image && make build && make test, and you’ve got a working function replacement extension.

Where This Goes

We built the simplest version: replace a function with a closure. But there’s more to explore:

  • What if the replacement wants to call the original? You’d need to pass the original function as a callable. Creating closures from C with attached context is its own adventure.
  • What about class methods? The function table lives on the class entry, not EG(function_table). Interface observation means walking class hierarchies.
  • What about stacking multiple replacements? That’s middleware. Each layer wraps the previous one.

But those are stories for another time. For now, we have a working extension, a deeper understanding of how PHP executes function calls, and a healthy respect for DO_UCALL.

Thanks for reading. If you found this series useful, the best thing you can do is try building your own extension. Even a small one that does nothing useful. That’s how this whole thing started.
