Implementing a more robust solution for deprecating counters in ModSecurity.
Loadbalancer.org loves free and open-source software!
We found a problem, investigated its cause, and are publishing our solution to give back to the wider community (the Lua script described in this blog post can be found on GitHub here).
When we work together and share our contributions, we all get stronger and the software we use gets better. Also, this might help somebody else out of a similar situation one day…
Read on to find out how ModSecurity ‘cools off’ variables under the hood, to understand the implications of this implementation, and to find a link to our workaround script, which can handle far more complex 'user session' scenarios with ModSecurity.
In the world of web application security, it can be invaluable to consider a user's behaviour across the entire duration of their web app session. This brings "the bigger picture" into view. It allows for more intelligent decisions to be made using a wider context: what ongoing behaviours look malicious and should be blocked? The alternative is to examine each HTTP request in isolation, offering only a limited, instantaneous view of what's happening.
ModSecurity (2.x), which powers the WAF functionality on our load balancer, comes with built-in features for storing and updating session based data over time. Variables can be used as counters to track the number of times a user accesses certain resources or submits particular kinds of requests. For example, a counter could track how many times a user submits credentials to a login page. Combining these counters with ModSecurity's rule logic allows for thresholds and trigger actions to be defined, creating a powerful tool to block anomalous-looking traffic. Brute force attempts to guess passwords or usernames can be stopped by blocking an IP address that exceeds a set threshold for an allowed number of login attempts.
A key component of this 'counter and threshold' approach is to "cool off" counter variables over time. This prevents counters from increasing forever, which would otherwise cause genuine, well-behaved users to eventually exceed a set threshold, however long that may take, and be blocked in error. It's important to be able to tell the difference between a user accessing a login portal 20 times in an hour and 20 times in 5 seconds! To this end, ModSecurity provides the deprecatevar action to gradually decrease variable counters over time.
For complex setups tracking and deprecating multiple counters per user session, the deprecatevar action has been found to produce unexpected results. Crucially, deprecate actions for different variables can interfere with each other and prevent counters from decreasing as expected. A more robust solution is required to handle scenarios that push ModSecurity's built-in actions to their limits.
The Problem in Short
Under the hood, the deprecatevar action in ModSecurity (2.x) uses a per-record timestamp to calculate how much a given variable should be decremented by. This can cause unexpected behaviour when records hold multiple variables, particularly when different variables are set to decrement at different rates (for example, having a ‘fast cool off counter’ and a ‘slow cool off counter’ in the same collection).
ModSecurity's Persistent Storage
To give it the ability to look beyond isolated HTTP transactions, ModSecurity provides persistent storage: a simple database mechanism allowing variables to be stored and updated over time across multiple transactions. This mechanism makes it possible to track user behaviours on a per-session basis over an extended period of time.
Five predefined persistent storage collections (database tables) are available for use within ModSecurity, intended for slightly different use cases, for example tracking information per-IP address or tracking per-session ID. A record is created in a given collection for a unique key that is seen, for example when a new IP address makes a request. ModSecurity provides actions for creating and updating variables in a record, such as creating or increasing a counter associated with a specific IP address.
Expiring and “Cooling Off” Variables
When defining a new variable in ModSecurity, it's a very good idea to set a variable expiry time using the expirevar action. This ensures that the variable will be automatically unset after a defined period of inactivity. This is useful, for example, to block clients for fixed periods of time if they meet specified conditions. Consider defining the variable client_is_blocked=1 and setting it to expire after 600 seconds: this would enforce a 10 minute block on a given user.
In addition to the expire action, ModSecurity also provides the deprecatevar action which 'cools off' variables over time. For example, the action deprecatevar:IP.my_counter=2/10 instructs ModSecurity to decrease the my_counter variable by 2 every 10 seconds. This action is useful as it stops counters from increasing endlessly.
The ModSecurity SDBM utility allows the contents of the persistent collection files to be dumped in plain text on the command line. This provides useful insight into how ModSecurity's actions work. Here's an example from a simple test, looking at the IP collection after a single variable has been set using
As can be seen, many housekeeping variables are stored behind the scenes to enable persistent collections to function correctly. The penultimate line shows that the user-configured my_counter variable has been stored as intended.
Of particular interest is the variable on the final line: LAST_UPDATE_TIME. This is an automatically defined 'housekeeping' variable which stores a timestamp, in Unix time, representing when the record was last updated. This timestamp is used by the deprecatevar action to determine whether enough time has passed for a given variable to be deprecated. It is also used to determine how much a variable should be deprecated. For example, if evaluating current_time - LAST_UPDATE_TIME produces 34 and the variable in question is set to deprecate by 2 every 10 seconds, then the variable will be decreased by a total of 6 (because 3 whole 'deprecation periods' have passed since LAST_UPDATE_TIME).
Thanks to the power of free and open-source software, ModSecurity's source code can be examined to verify how deprecatevar works. The following code confirms the use of the update timestamp (in the context of the full source file), and can be found in
Here's the rub: when the deprecatevar action is executed, if enough time has passed for a given variable to be deprecated then the variable's value is updated accordingly. Updating a variable means that the collection record has been updated, and so the LAST_UPDATE_TIME variable is also updated: its value is set to the current time.
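The arithmetic and the timestamp reset described above can be modelled in a few lines of Python. This is an illustrative sketch only, not ModSecurity's actual code: the deprecate function and the dictionary standing in for a collection record are hypothetical, and the sketch assumes the counter is never allowed to drop below zero.

```python
def deprecate(record, name, amount, period, now):
    """Decrease record[name] by `amount` for every whole `period` elapsed.

    Hypothetical model: `record` is a plain dict standing in for a
    persistent collection record with a single, shared timestamp.
    """
    elapsed = now - record["LAST_UPDATE_TIME"]
    whole_periods = elapsed // period  # e.g. 34 // 10 == 3
    new_value = max(0, record[name] - amount * whole_periods)
    if new_value != record[name]:
        record[name] = new_value
        # The record changed, so its shared timestamp is reset.
        record["LAST_UPDATE_TIME"] = now
    return record


record = {"my_counter": 10, "LAST_UPDATE_TIME": 1_000}
deprecate(record, "my_counter", amount=2, period=10, now=1_034)
print(record)  # {'my_counter': 4, 'LAST_UPDATE_TIME': 1034}
```

With 34 seconds elapsed, three whole periods have passed, so the counter drops by 6 and the record's timestamp jumps to the current time.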
What if the collection contains other variables that also rely on the value of LAST_UPDATE_TIME for the purposes of deprecation? The act of deprecating one variable changes the timestamp that all variables in the collection rely on for the purposes of calculating their deprecation!
The key limitation here is that all deprecatevar actions for all variables in a given record are tied to the same timestamp. If more than one variable in a collection has an associated deprecatevar action defined then unexpected behaviour can arise, which may be very difficult to diagnose.
This limitation is particularly pronounced when multiple variables are set to be decremented at different rates. For example, a particularly active variable set to decrement every 2 seconds could prevent other, slower variables, perhaps set to decrement every 20 seconds, from decreasing in value. In the worst case scenario, the activity from the "busy" '2 second' counter would prevent the '20 second' counters from ever being decremented. In a 'counter and threshold' style setup, users' counters would eventually hit their associated thresholds and cause users to be blocked for no immediately obvious reason.
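The starvation effect can be reproduced with a small simulation. This is a simplified model under stated assumptions, not ModSecurity itself: the handle_request function, the rules dictionary, and the counter names fast and slow are all hypothetical, and the model assumes the shared timestamp is reset once per request whenever any counter in the record changes.

```python
def handle_request(record, rules, now):
    """Apply every deprecation rule against the record's shared timestamp."""
    elapsed = now - record["LAST_UPDATE_TIME"]
    changed = False
    for name, (amount, period) in rules.items():
        new_value = max(0, record[name] - amount * (elapsed // period))
        if new_value != record[name]:
            record[name] = new_value
            changed = True
    if changed:
        # Any change to the record resets the timestamp for ALL counters.
        record["LAST_UPDATE_TIME"] = now


# (amount, period): "fast" cools by 1 every 2 s, "slow" by 1 every 20 s.
rules = {"fast": (1, 2), "slow": (1, 20)}
shared_record = {"fast": 100, "slow": 5, "LAST_UPDATE_TIME": 0}

# One request every 2 seconds for a minute.
for now in range(2, 61, 2):
    handle_request(shared_record, rules, now)

print(shared_record["fast"], shared_record["slow"])  # 70 5
```

After 60 seconds the slow counter should have cooled off three times, yet in this model it never moves, because the fast counter resets the shared timestamp on every request.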
Lastly, another problem is that unconditional variable deprecation is impossible. For example, if a given variable must be decremented by a value of 5 every 10 seconds, no matter what, the act of modifying the associated record in any way (not necessarily relating to the counter in question) causes LAST_UPDATE_TIME to be reset, hence restarting the 10 second 'cool down' period for the deprecatevar action. This makes it impossible to guarantee that a given counter will cool off at a constant rate over time. Not only must the given counter be idle but the entire record in question must be idle for at least the 10 second period, in this example. A scenario where users must be guaranteed to be allowed 10 login attempts every 60 seconds, for example, would require having a constant rate of deprecation for the associated counter. The deprecatevar action cannot reliably achieve this.
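To illustrate the point, here is a minimal sketch in which an unrelated variable in the same record is modified every 9 seconds. The function touch_and_deprecate and the variable names logins and unrelated are hypothetical, and the dictionary is only a stand-in for a collection record.

```python
def touch_and_deprecate(record, name, amount, period, now):
    """Try to deprecate `name`, then modify an unrelated variable."""
    elapsed = now - record["LAST_UPDATE_TIME"]
    record[name] = max(0, record[name] - amount * (elapsed // period))
    # An unrelated update still counts as a record update...
    record["unrelated"] += 1
    # ...so the shared timestamp is reset on every request.
    record["LAST_UPDATE_TIME"] = now


record = {"logins": 50, "unrelated": 0, "LAST_UPDATE_TIME": 0}

# One request every 9 seconds: always just short of the 10 second period.
for now in range(9, 64, 9):
    touch_and_deprecate(record, "logins", amount=5, period=10, now=now)

print(record["logins"])  # 50
```

Over those 63 seconds, a counter cooling at a constant rate of 5 every 10 seconds would have fallen from 50 to 20. In this model it never decreases, because the record is never idle for a full period.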
Use More Collections
A simple way around the shared timestamp issue is to split up counter variables across multiple collections. The persistent storage mechanism provides five collections to use, meaning unused collections could be used to hold and handle different counters completely separately. This fix isn't possible if there are more counter variables than spare collections, however.
Timestamps for Everyone!
The core of the problem presented here is that the deprecation logic, which is applied to all variables, is tied to a single, shared timestamp. Perhaps the most self-evident solution is to give each variable its own unique timestamp.
How Not to Work with Timestamps
The first idea for this workaround used ModSecurity SecRules to set a timestamp variable for each counter variable that needed to be decremented over time. That's easy enough to do: ModSecurity's variable expansion allows a variable to be set to the value of TIME_EPOCH, creating a new timestamp.
It's advisable not to perform any arithmetic on these new timestamp variables themselves if they're being kept in persistent storage. Doing so causes odd behaviour which is difficult to diagnose (the author had to examine debug logs set to maximum verbosity to understand the issue). ModSecurity updates variables in persistent storage in a roundabout way, by calculating the delta (change) and applying it to their values. It gets confused easily once things become slightly complex, for example when adding to variables in one location, subtracting from them somewhere else later on, and so on. As an alternative, any necessary arithmetic should be performed using transaction variables (in the TX collection) and then actions should be taken accordingly.
Reimplementing the deprecatevar action using SecRules, as outlined above, is mostly possible: the action can be approximated by pre-calculating a series of constants. That approach works for a limited range of values, but it's very ugly, verbose, and difficult to maintain. The main difficulty here is that ModSecurity doesn't expose any way to perform multiplication or division operations.
This is where Lua steps in to save the day. The same logic that requires a table of constants and dozens of SecRules to approximate can be accomplished precisely with a single line of Lua code:
The Lua script can be found on GitHub here
The full Lua script involves some preparatory work and accounting for certain scenarios, but, on the whole, it's fairly simple. The script takes the name of the variable to be decremented and assigns that variable its own unique timestamp, if one doesn't already exist. The script then uses the variable's own timestamp to determine whether enough time has elapsed to warrant decrementing the variable and, if so, by how much. The variable is updated in persistent storage if it needs to be decremented.
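The per-variable logic described above can be modelled as follows. This is a Python sketch of the idea, not the Lua script itself: the function name, the "_timestamp" suffix, and the dictionary-based record are hypothetical. Advancing the timestamp by the whole periods consumed (rather than setting it to the current time) is an assumption of this sketch, chosen so that leftover seconds carry over to the next call.

```python
def decrement_independently(record, name, amount, interval, now):
    """Cool off `name` using a timestamp that belongs to it alone."""
    ts_key = name + "_timestamp"  # hypothetical naming scheme
    if ts_key not in record:
        # First sighting: give the variable its own timestamp and stop.
        record[ts_key] = now
        return record
    whole_periods = (now - record[ts_key]) // interval
    if whole_periods >= 1:
        # Clamp at zero so a negative value is never written back.
        record[name] = max(0, record[name] - amount * whole_periods)
        # Assumption: advance only by the time actually accounted for.
        record[ts_key] += whole_periods * interval
    return record


record = {"fast": 100, "slow": 5}

# One request every 2 seconds for a minute, starting at t = 0.
for now in range(0, 61, 2):
    decrement_independently(record, "fast", amount=1, interval=2, now=now)
    decrement_independently(record, "slow", amount=1, interval=20, now=now)

print(record["fast"], record["slow"])  # 70 2
```

Because each counter consults only its own timestamp, the busy counter no longer affects the slow one: over 60 seconds the slow counter cools off three times, as intended.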
When calling the Lua script, the script needs to be passed the name of the variable to decrement, the interval between decrement actions (in seconds), and the amount to decrement the variable by each time. Transaction variables are used to pass these 'arguments' to the function in the script. Using a SecRule to call the new Lua decrement action looks like the following:
Note that a SecAction directive could also be used, but using SecRule directives allows the Lua script to only be called when it's definitely required. This is useful in light of performance considerations discussed later.
The result of using this Lua script is that variable counters can be decremented independently, without the risk of different counters interfering with each other. When using the script, variables to be decremented have their own unique timestamp associated to them:
As an aside, when writing Lua scripts for use with ModSecurity, don't allow negative values to be assigned to ModSecurity variables. The result is confusing: ModSecurity interprets assigning a negative value as the '=-' operator. For example, assigning -4 to var is interpreted as var=-4: var is decremented by 4, when one might be expecting -4 to be rounded up to 0, leaving var set to 0.
The script could likely be made more efficient. This was the author's first exploration of scripting with Lua, so there are likely to be improvements that could be made (for example, using Lua's native os.time() function may be more efficient than pulling in the TIME_EPOCH variable from ModSecurity).
Even though ModSecurity has native Lua support, invoking the Lua script is slow compared to the built-in deprecatevar action. Testing on a virtual machine with 1 vCPU, the Lua script took approximately 500 μs to process per invocation (for comparison, on the same test system, the deprecatevar action was observed to take less than 1 μs to complete per action). This could be an issue, for example, if dozens of separate counters required the script to be repeatedly invoked for every request.
One idea to improve performance is, instead of calling the same Lua script multiple times (once for each variable), to use a single script configured to read and update multiple variables within a single execution. That would avoid repeating the (presumed) overhead of starting the Lua engine and executing a script.
The dream would be to implement this new deprecation logic directly in ModSecurity. Writing a new action to deprecate counters on a per-variable basis should allow this to be performed as quickly as the default deprecatevar action.
The limitation described here appears to be a fairly niche corner case. Owing to ModSecurity's free and open-source nature, it was possible to determine the cause and extent of the limitation. Using the built-in Lua engine, a new, more robust deprecate action was developed, able to handle more complex setups requiring the deprecation of multiple counters within the same collection.
It took a long time and much testing to understand the nuances of this issue and why it was happening. Hopefully, this information and the new script may be useful to the community, and might help someone encountering a similar problem in the future.