Extending ModSecurity: How to add completely custom WAF functionality

ModSecurity is a mature and fully-featured web application firewall which we use to power our product’s WAF functionality. Here I outline, step-by-step, how to add new and custom functionality.

ModSecurity Extensions

ModSecurity features a diverse array of variables, operators, and transformations, allowing you to slice, dice, and inspect HTTP traffic however you want. You can accomplish some very clever things with ModSecurity’s rule language, but what happens if the built-in functionality doesn’t do what you need it to? What if you require some entirely custom functionality in order to inspect your HTTP traffic in a meaningful way? If you’re using ModSecurity v2 on Apache* then you’re in luck. ModSecurity can be extended using the Apache module architecture.

*Note: As of the time of writing, this is considered to be the ModSecurity reference platform for the OWASP ModSecurity Core Rule Set project (CRS). This is also the type of ModSecurity implementation that can be found on Loadbalancer.org appliances.

Adding a Custom Transformation Function

The Goal

In this example, I’m going to add a new transformation function to ModSecurity to calculate the Scrabble score of a variable. This will allow us to block HTTP requests containing query string parameters with a Scrabble score above a chosen threshold.

Desired Functionality:
Score threshold: 20

Request 1:
index.html?param=aardvark
Aardvark (score: 16): come on in!

Request 2:
index.html?param=xylophone
Xylophone (score: 24): get outta here!

This example function is a bit of fun, but hopefully you’ll see how this concept can be used to build totally custom WAF functionality which could be useful when working with real world web applications.

Step 1: Preparation

ModSecurity extensions take the form of Apache modules. This means that any extensions we create are discrete units which can be loaded into our Apache configuration without needing to modify the stock, vanilla version of ModSecurity in any way. Pretty neat!

To proceed, you’ll need the ability to compile custom Apache modules, as well as access to the source code for the exact version of ModSecurity you’re using. If you’ve followed the excellent tutorials on compiling Apache and compiling ModSecurity over at netnea.com then you’ll already have everything you need.

If you don’t have the ModSecurity source code (for the exact version you’re using) then you can grab it from GitHub and put it in an appropriate location:

# mkdir /usr/src/modsecurity
# cd /usr/src/modsecurity/
# wget https://github.com/SpiderLabs/ModSecurity/releases/download/v2.9.4/modsecurity-2.9.4.tar.gz

You can optionally verify the integrity and authenticity of the tarball you’ve just downloaded by using its checksum, like so:

# wget https://github.com/SpiderLabs/ModSecurity/releases/download/v2.9.4/modsecurity-2.9.4.tar.gz.sha256
# sha256sum --check modsecurity-2.9.4.tar.gz.sha256
modsecurity-2.9.4.tar.gz: OK

Extract the contents of the tarball and you’re ready to go:

# tar xf modsecurity-2.9.4.tar.gz

Step 2: Writing the New Transformation Function

The ext directory of the ModSecurity source code contains four example extensions. In particular, the file mod_tfn_reverse.c is useful: it contains a full, working example of an extension that adds a new transformation function (it reverses the input string it’s given). These example files can be used as the basis for creating new extensions.

Let’s start a new file, mod_tfn_scrabble.c, and begin by adding the required file inclusion directives:

#include <stdio.h>  /* sprintf */
#include <string.h> /* strlen */

#include "httpd.h"
#include "http_core.h"
#include "http_config.h"
#include "http_log.h"
#include "http_protocol.h"
#include "ap_config.h"
#include "apr_optional.h"

#include "modsecurity.h"

Now let’s write the most interesting part: the function that performs the actual transformation. All transformation functions follow the same pattern:

static int scrabble(apr_pool_t *mptmp, unsigned char *input, long int input_len, char **rval, long int *rval_len)
  • mptmp is a pointer to a memory pool we can allocate memory from, if we find ourselves needing a block of memory to use
  • input is a pointer to the input string (to be transformed)
  • input_len is the length of the input string (note that strings aren’t null-terminated in ModSecurity! Length parameters like input_len are used, instead)
  • rval is a pointer to a pointer to a string: we use this to point to the output (transformed) string
  • rval_len is the length of the output string
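
To make this calling pattern concrete, here is a minimal sketch of a different, simpler transformation that upper-cases ASCII letters in place. The names demo_pool_t and demo_uppercase are hypothetical: demo_pool_t is only a stand-in for apr_pool_t so that the sketch compiles without the APR headers, and a real module would use the APR type instead. The point to notice is that the loop is driven by input_len and never looks for a terminating null character.

```c
#include <stddef.h>

/* Stand-in for apr_pool_t (an assumption for illustration only, so the
 * sketch builds without the APR headers). */
typedef struct demo_pool demo_pool_t;

/* Same parameter shape as a ModSecurity v2 transformation function:
 * upper-cases ASCII letters in place and reports whether anything changed. */
int demo_uppercase(demo_pool_t *mptmp, unsigned char *input,
                   long int input_len, char **rval, long int *rval_len)
{
    long int i;
    int changed = 0;

    (void) mptmp; /* no allocation needed: the output fits in the input */

    /* Iterate by length, never by searching for a terminating NUL. */
    for (i = 0; i < input_len; i++) {
        if (input[i] >= 'a' && input[i] <= 'z') {
            input[i] = (unsigned char) (input[i] - 'a' + 'A');
            changed = 1;
        }
    }

    /* The output is the (possibly modified) input buffer itself. */
    *rval = (char *) input;
    *rval_len = input_len;

    /* 1 if the content was changed, 0 if it is identical to the input. */
    return changed;
}
```

Because the output here is never longer than the input, it is safe to reuse the input buffer and leave the memory pool untouched.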

We’ll use a variable to keep a running total of the score (unimaginatively named score). To look up the score for an individual letter we’ll use a switch statement.

We don’t know whether we’ll be given uppercase or lowercase letters to work with. To keep things simple (and to avoid problems if we mandate the use of t:lowercase with our new transformation but forget to add it to our SecRules), let’s just account for both letter cases in our switch statement.

Finally, we’ll put our switch statement inside of a while statement, which allows us to loop through each letter of the input string. Bringing that all together, it looks like so:

long int score = 0L, i = 0L;

while (i < input_len) {
    switch (input[i]) {
        case 'a': case 'A': case 'e': case 'E': case 'i': case 'I':
        case 'o': case 'O': case 'u': case 'U': case 'l': case 'L':
        case 'n': case 'N': case 's': case 'S': case 't': case 'T':
        case 'r': case 'R':
            score += 1L;
            break;
        case 'd': case 'D': case 'g': case 'G':
            score += 2L;
            break;
        case 'b': case 'B': case 'c': case 'C': case 'm': case 'M':
        case 'p': case 'P':
            score += 3L;
            break;
        case 'f': case 'F': case 'h': case 'H': case 'v': case 'V':
        case 'w': case 'W': case 'y': case 'Y':
            score += 4L;
            break;
        case 'k': case 'K':
            score += 5L;
            break;
        case 'j': case 'J': case 'x': case 'X':
            score += 8L;
            break;
        case 'q': case 'Q': case 'z': case 'Z':
            score += 10L;
            break;
        /* Not a letter? No score increase */
        default:
            break;
    }

    i++;
}
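
The same letter values can also be expressed as a lookup table, which some readers may find easier to audit than a long switch statement. The following is only a sketch of that alternative, not the code used in the rest of this walk-through: scrabble_score and letter_points are hypothetical names, the function is a standalone helper, and it assumes an ASCII-compatible character set.

```c
/* Points for 'a'..'z', matching the values in the switch statement. */
static const int letter_points[26] = {
 /* a  b  c  d  e  f  g  h  i  j  k  l  m */
    1, 3, 3, 2, 1, 4, 2, 4, 1, 8, 5, 1, 3,
 /* n  o  p  q   r  s  t  u  v  w  x  y  z */
    1, 1, 3, 10, 1, 1, 1, 1, 4, 4, 8, 4, 10
};

/* Sum the Scrabble points of the first input_len bytes of input.
 * The input is length-delimited and need not be NUL-terminated. */
long int scrabble_score(const unsigned char *input, long int input_len)
{
    long int score = 0L, i;

    for (i = 0; i < input_len; i++) {
        unsigned char c = input[i];

        /* Fold uppercase ASCII letters to lowercase. */
        if (c >= 'A' && c <= 'Z') {
            c = (unsigned char) (c - 'A' + 'a');
        }

        /* Anything that is not a letter scores nothing. */
        if (c >= 'a' && c <= 'z') {
            score += letter_points[c - 'a'];
        }
    }

    return score;
}
```

Calling scrabble_score on the eight bytes of "aardvark" gives 16, and on the nine bytes of "xylophone" gives 24, matching the scores quoted at the start of this example.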

We can now determine the Scrabble score of the input string. To avoid allocating memory for the answer, we can write it over the original input string (numeric values are returned to ModSecurity as strings). A call of sprintf writes our score as a string, overwriting the input string. Note that input is declared as unsigned char *, so it is cast to char * for the standard string functions:

/* Print the total score as a string, directly over the input */
sprintf((char *) input, "%ld", score);

We need to make sure that rval, which is expected to point to the output (transformed) string, is set to point to our answer, i.e. the newly overwritten input string:

/* Return value is just the input string again, now overwritten */
*rval = (char *) input;

Finally, we need to make sure that rval_len contains the length of the output string, which can be calculated with a call of strlen, like so:

/* Return value length is length of the score string */
*rval_len = strlen((char *) input);

To close off the function, we need to return an appropriate value. A ModSecurity transformation function should always return 1 if the content being returned is not identical to the input. In our example, the content being returned will always be different to the input, so we can unconditionally return a value of 1:

/* Return 1 if you change the input, and 0 if you don't */
return 1;

There’s one particular scenario we need to account for: what if our function receives a very short input string? Consider the string Z: this would have a total score of 10, so our function would write the string 10 over the input string Z: uh-oh! Our output string is now longer than our input string: overwriting our input string with our result, which is a longer string, is a very bad idea. Consider also that the sprintf function will add a terminating null character to the string it writes, so the three-character long string 10\0 is actually being written over the input string Z.

In this case, we need to ask Apache to allocate a block of memory to hold our result. We can do this with a call of the apr_palloc function, specifying a memory pool from which to allocate this memory block (for this we use the mptmp parameter from our function, which points to a memory pool we can use) and the number of bytes of memory to allocate.

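To check for this special case, we can test if the input string has a length less than 3 (i.e. it’s at most two characters long) and assume the worst case scenario: a two-digit score requiring three characters to write out (digit, digit, null character). From three characters upwards the result always fits, because no letter scores more than 10: a three-letter input scores at most 30, which still needs only three bytes. Bringing this all together, it looks like: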

/* Check for short input string scenario. Assume worst case scenario: a
 * one or two character-long input string could have a two-digit score
 * (e.g. "Z" => "10") which sprintf would store as three characters
 * (digit-digit-null), requiring three bytes of memory */
if (input_len < 3) {
    /* Allocate memory to store a three byte string (worst case) */
    *rval = (char *) apr_palloc(mptmp, 3);

    /* Print the total score as a string into allocated memory */
    sprintf(*rval, "%ld", score);
    /* Return value length is length of the score string */
    *rval_len = strlen(*rval);

    return 1;
}

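This if statement belongs directly after our while loop (not inside it) and ahead of the sprintf call that overwrites the input, so that short inputs return before the in-place write is reached. And with that, our main transformation function is complete!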
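
For anyone who would rather measure than reason about the worst case, the following sketch shows one way to check whether a score can safely be written over the input. It is an illustration only and not part of the module: score_fits_in_input is a hypothetical helper, and it relies on the C99 behaviour of snprintf, which reports the length it would have written when given a NULL buffer of size 0.

```c
#include <stdio.h>

/* Returns 1 if the decimal form of score, plus the NUL that sprintf
 * appends, fits within input_len bytes; returns 0 otherwise. */
int score_fits_in_input(long int score, long int input_len)
{
    /* With a NULL buffer and a size of 0, C99 snprintf writes nothing and
     * returns the number of characters needed (excluding the NUL). */
    int digits = snprintf(NULL, 0, "%ld", score);

    return digits >= 0 && (long int) digits + 1 <= input_len;
}
```

A score of 10 does not fit in a one-byte input and a score of 20 does not fit in a two-byte input, whereas 30 fits in three bytes, which is consistent with the "length less than 3" test used above.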

Step 3: Adding the Boilerplate Parts

We now need to register our function with ModSecurity before it can be used. To do this, we use a separate registration function, which looks like so:

/**
 * Register transformation function with ModSecurity.
 */
static int hook_pre_config(apr_pool_t *mp, apr_pool_t *mp_log, apr_pool_t *mp_temp)
{
    void (*fn)(const char *name, void *fn);

    /* Look for the registration function exported by ModSecurity. */
    fn = APR_RETRIEVE_OPTIONAL_FN(modsec_register_tfn);

    if (fn) {
        /* Use it to register our new transformation function under the
         * name "scrabble". */
        fn("scrabble", (void *) scrabble);
    } else {
        ap_log_error(APLOG_MARK, APLOG_ERR | APLOG_NOERRNO, 0, NULL,
            "mod_tfn_scrabble: Unable to find modsec_register_tfn.");
    }

    return OK;
}

Function pointers can make code look more complex than it actually is. The important part here is that we’re registering our transformation function under the name “scrabble”: we will later use this name when using our new transformation in a SecRule.
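
If the function pointer declaration is the off-putting part, a typedef can make the same "look up, check, then call" pattern easier to read. The sketch below is a standalone imitation of that pattern and does not use the Apache or ModSecurity APIs: every demo_ name is hypothetical, and demo_retrieve merely stands in for the optional-function lookup, returning NULL when the provider is not loaded.

```c
#include <stddef.h>
#include <string.h>

/* A typedef gives the registration function's pointer type a name. */
typedef void (*demo_register_tfn_t)(const char *name, void *fn);

/* Records the most recent registration (a stand-in for a real registry). */
static const char *demo_last_name = NULL;
static void *demo_last_fn = NULL;

/* Stand-in for the registration function a provider would export. */
static void demo_register_tfn(const char *name, void *fn)
{
    demo_last_name = name;
    demo_last_fn = fn;
}

/* Stand-in for the optional lookup: NULL when the provider is absent. */
demo_register_tfn_t demo_retrieve(int provider_loaded)
{
    return provider_loaded ? demo_register_tfn : NULL;
}

/* Same shape as hook_pre_config: look up, check for NULL, then call.
 * Returns 1 if the registration was made, 0 if the provider was missing. */
int demo_register_scrabble(int provider_loaded, void *tfn)
{
    demo_register_tfn_t fn = demo_retrieve(provider_loaded);

    if (fn == NULL) {
        return 0;
    }

    fn("scrabble", tfn);
    return 1;
}

/* Returns 1 if a function is currently registered under the given name. */
int demo_is_registered(const char *name)
{
    return demo_last_name != NULL && demo_last_fn != NULL
        && strcmp(demo_last_name, name) == 0;
}
```

The NULL check matters for the same reason as in the real hook: if the provider is not available, there is nothing to call, and the only sensible action is to report the problem.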

The last two pieces of boilerplate that we need to add are a register_hooks function and a special structure that indicates to Apache that this is a module. These look like the following:

/**
 * Register to be invoked before configuration begins.
 */
static void register_hooks(apr_pool_t *p)
{
    ap_hook_pre_config(hook_pre_config, NULL, NULL, APR_HOOK_LAST);
}

/**
 * This structure is used by Apache to determine that a dynamic
 * library it is loading is a genuine module.
 */
module AP_MODULE_DECLARE_DATA tfn_scrabble_module = {
    STANDARD20_MODULE_STUFF,
    NULL,                  /* create per-dir    config structures */
    NULL,                  /* merge  per-dir    config structures */
    NULL,                  /* create per-server config structures */
    NULL,                  /* merge  per-server config structures */
    NULL,                  /* table of config file commands       */
    register_hooks         /* register hooks                      */
};

The name of the module structure is important: this is the name that will be used in the Apache configuration later to load our module.

Step 4: Compilation

If all is well, our new function can now be compiled as an Apache module! This is done using the apxs tool, which can be executed like so:

/apache/bin/apxs -ci -I/usr/src/modsecurity/modsecurity-2.9.4/apache2/ -I/usr/include/libxml2/ mod_tfn_scrabble.c

The paths given should be changed as appropriate to point to the directories where your ModSecurity and libxml2 header files are located. The source file name determines the name of the compiled module, so mod_tfn_scrabble.c produces mod_tfn_scrabble.so.

It’s also possible to use the -a switch when using the apxs command. This will cause apxs to attempt to automatically add a line to your Apache configuration file to load your new module. This is easy enough to do by hand, and would look like the following:

LoadModule tfn_scrabble_module modules/mod_tfn_scrabble.so

Step 5: Testing the New Module (The Xylophone Test)

We can now write a simple SecRule to test out our new transformation function. Let's add the following rule to our Apache configuration:

SecRule ARGS "@ge 20" \
    "id:1,\
    phase:1,\
    deny,\
    t:none,t:scrabble,\
    log,\
    msg:'%{MATCHED_VAR_NAME} met or exceeded the Scrabble score threshold: score = %{MATCHED_VAR}'"

This rule will deny a request if one of its arguments, e.g. a query string parameter, has a Scrabble score greater than or equal to 20. It will also write the Scrabble score that was observed to the log, if the rule matches. We can easily test this rule using cURL.
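
Since our transformation hands its score back as a string, the comparison in the rule has to turn that string back into a number. The sketch below shows one plausible way such a "greater than or equal" check could work on a length-delimited value. It is an assumption for illustration and is not ModSecurity's actual implementation: demo_ge is a hypothetical name and the 32-byte scratch buffer is an arbitrary choice.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the decimal number held in the first value_len bytes of
 * value is greater than or equal to threshold; returns 0 otherwise,
 * including when the value is empty, too long, or not a number. */
int demo_ge(const char *value, long int value_len, long int threshold)
{
    char buf[32];
    char *end;
    long int parsed;

    if (value_len <= 0 || value_len >= (long int) sizeof(buf)) {
        return 0;
    }

    /* The value is not NUL-terminated, so copy it and terminate the copy
     * before handing it to strtol. */
    memcpy(buf, value, (size_t) value_len);
    buf[value_len] = '\0';

    parsed = strtol(buf, &end, 10);
    if (end == buf) {
        return 0; /* no digits were found */
    }

    return parsed >= threshold;
}
```

With a threshold of 20, the two-byte value "24" matches and "16" does not, which mirrors the xylophone and aardvark cases from the start of this example.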

First, let's test sending a request with the word "aardvark", which we saw earlier has a Scrabble score of 16 (less than our threshold of 20). We expect this request to be allowed by ModSecurity:

$ curl -o /dev/null -v 192.168.85.188/index.html?test=aardvark
*   Trying 192.168.85.188:80...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* Connected to 192.168.85.188 (192.168.85.188) port 80 (#0)
> GET /index.html?test=aardvark HTTP/1.1
> Host: 192.168.85.188
> User-Agent: curl/7.78.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Tue, 10 Aug 2021 15:51:44 GMT
< Server: Apache
< Last-Modified: Mon, 11 Jun 2007 18:53:14 GMT
< ETag: "2d-432a5e4a73a80"
< Accept-Ranges: bytes
< Content-Length: 45
< 
{ [45 bytes data]
100    45  100    45    0     0  10518      0 --:--:-- --:--:-- --:--:-- 15000
* Connection #0 to host 192.168.85.188 left intact

Success! Note the HTTP/1.1 200 OK line, indicating that our request made it through ModSecurity without being denied.

Now let's test sending a request with the word "xylophone", which has a score of 24 (exceeding our threshold of 20). We're expecting this request to be denied by ModSecurity:

$ curl -o /dev/null -v 192.168.85.188/index.html?test=xylophone
*   Trying 192.168.85.188:80...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* Connected to 192.168.85.188 (192.168.85.188) port 80 (#0)
> GET /index.html?test=xylophone HTTP/1.1
> Host: 192.168.85.188
> User-Agent: curl/7.78.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 403 Forbidden
< Date: Tue, 10 Aug 2021 15:51:53 GMT
< Server: Apache
< Content-Length: 199
< Content-Type: text/html; charset=iso-8859-1
< 
{ [199 bytes data]
100   199  100   199    0     0  31196      0 --:--:-- --:--:-- --:--:-- 33166
* Connection #0 to host 192.168.85.188 left intact

It works! The line HTTP/1.1 403 Forbidden indicates that the request was denied by ModSecurity, as we hoped. This means that our test SecRule using our new transformation function is working! We can verify this by taking a look at the Apache error log, presented here with added whitespace and newlines for readability:

[2021-08-10 11:51:53.388582] [-:error] 192.168.85.1:53598 YRKgmT1xHMwv4Vbe2Xj15gAAABg
    [client 192.168.85.1]
    ModSecurity: Access denied with code 403 (phase 1). Operator GT matched 20 at ARGS:test.
    [file "/opt/apache-2.4.48/conf/httpd.conf"]
    [line "145"]
    [id "1"]
    [msg "ARGS:test met or exceeded the Scrabble score threshold: score = 24"]
    [tag "Local Test Service"]
    [hostname "localhost"]
    [uri "/index.html"]
    [unique_id "YRKgmT1xHMwv4Vbe2Xj15gAAABg"]

This log entry tells us that ModSecurity correctly calculated the Scrabble score: [msg "ARGS:test met or exceeded the Scrabble score threshold: score = 24"].

C vs Lua: Can we just use a Lua script instead?

ModSecurity features built-in support for the Lua programming language, allowing complex rule logic to be written and executed in the form of Lua scripts. A valid question is: "Can we use a Lua script to perform this Scrabble transformation, rather than writing and compiling an Apache module?"

The answer is yes! As an experiment, I wrote a Lua script which had the same effect as the custom transformation function (Apache module) described above. To get an idea of the relative performance of the C transformation and the Lua script, I crafted the following HTTP request which contains 500 query string parameters, each a ten letter word:

GET /index.html?0=abandoning&1=abasements&2=abatements...&497=beaujolais&498=beauregard&499=beautified HTTP/1.1

I submitted the test request 1000 times. The only SecRules in place were the modsecurity.conf-recommended rules and the SecRule ARGS "@ge 20" Scrabble test rule: neither the Core Rule Set nor any other rules were loaded, so nothing else was slowing down ModSecurity's processing.

I examined the Stopwatch2 line in the audit log and observed how long was being spent in phase 1 (which is where the Scrabble test rule was taking place). I ran this test in three configurations: with the Scrabble rule commented out; using the C transformation function; using the Lua script. The results were as follows:

                                    No Scrabble rule    C transformation    Lua script
Mean time spent in phase 1 (µs)     9                   2069                23994
Median time spent in phase 1 (µs)   8                   2315                26487

It can be seen that the C transformation was more than an order of magnitude faster than the Lua script. When performance is the biggest consideration and you have the right skill set available, the C transformation option is the clear choice. This would be especially true for complex processing or a function that must be called a significant number of times per request.

In fairness to the Lua option, the Lua script I wrote can almost certainly be improved and optimised by someone with experience in writing Lua code. The flexibility and relative simplicity of the Lua route may also make it an attractive option in some scenarios. For non-Apache based ModSecurity implementations, as well as libmodsecurity / ModSecurity v3 implementations, the Lua approach would be the only option.

The Limit is Imagination

The example transformation presented here was a fun little experiment to create and test. It demonstrated how truly flexible ModSecurity can be. All it takes is imagination to think of some novel ways that ModSecurity can be extended (how about a book seller that wants to verify ISBNs in requests hitting their stock system?).

If you have a real world scenario that needs to process and inspect HTTP traffic in a unique way then we'd love to hear from you! Likewise, if you've written any ModSecurity extensions or Lua scripts yourself then it would be great to share stories.