There are a million and five examples of braindead image leech protection using mod_rewrite out there, but mod_rewrite is total overkill in situations where you don’t actually want to rewrite anything. Its logging facilities also leave much to be desired, and the whole nasty process can easily consume more resources than it saves—which kinda defeats the purpose.
A typical specimen will look something like this:
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?domain.biz/.*$ [NC]
RewriteRule ^.*\.(gif|jpg|png)$ - [F]
So now you’re throwing out an [F] whenever a request for an image file comes through with a referrer that isn’t blank and that doesn’t match against your complicated domain whitelist. That lets a 403 response bubble up to Apache, which then has to look through its ErrorDocument declarations, see if one matches the 403 code, and suddenly you’re generating another subrequest for another file. And what if that 403.shtml SSI file that you forgot was declared in your default config isn’t actually there? Apache’s suddenly processing (and logging) a 404 on top of that, which makes no damn sense at all. Wouldn’t it be better if you could just drop the connection entirely and get a log full of information detailing why it happened? That’s where ModSecurity excels.
Where ModSecurity does not excel is in its documentation or rule definition DSL. Most of the examples out there are terribly outdated, and while there is a new book out on the subject, it sadly lacks concrete demonstrations of lengthy rule chaining, and contains quite a few grammatical errors that make parts of it difficult to get through. It’s also from the same publisher that brought you Programming Lua, a rather expensive tome whose flimsy, smudged pages appear to have been run off on a laser printer with an old toner cartridge, leading one to wonder where all that money is going.
One look at the tragic pastel ninja gracing the cover of the ModSecurity Handbook will almost certainly lead one to conclude that no portion of its proceeds is being spent on life drawing classes or a competent art director’s salary, but a grossly overpriced book with an embarrassingly awful cover illustration is better than no book at all; you can always rip the cover off if you’re worried your boss might fire you for reading anime comics at work.
Armed with its description of ModSecurity’s rule definition language (and a lot of trial-and-error) I believe I was able to successfully translate the venerable old mod_rewrite recipe into ModSecurity’s domain.
Here’s the config that I came up with:
SecRule HTTP_REFERER "!@rx ^$" phase:1,chain,auditlog,drop
SecRule HTTP_REFERER "!@pm mydomain.com mydomain.biz" chain
SecRule REQUEST_FILENAME "@rx \.(gif|jpg|png)$"
This logic should look amply familiar to anyone who’s ever used the mod_rewrite approach, so I’ll only go over some of the glaring differences.
First things first: only the first rule in a ModSecurity rule chain may declare disruptive actions, and the actions that matter most here (drop, deny and friends) are all disruptive.
This translates into immediate confusion for the uninitiated reader, as it looks like you’re telling ModSecurity to drop the connection right there on the first condition. Rest assured, this is not the case so long as a chain action sits beside it.
Nobody ever said it was a good language grammar, just a powerful one.
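For what it’s worth, here is the same chain again with comments, as a rough sketch for a 2.x release rather than a tested drop-in. Two things in it are my own assumptions and not part of the config above: the id value 100001 is an arbitrary placeholder (as far as I know, releases from 2.7 on refuse to load a rule without an id), and REQUEST_HEADERS:Referer is the spelling the 2.x reference manual documents for this header. The domains are the same stand-ins as before.

```apache
# Chain starter: carries the phase, the (placeholder) id and the disruptive
# action for the whole chain. Nothing is dropped unless every link matches.
# First link: a Referer header is present and is not empty.
SecRule REQUEST_HEADERS:Referer "!@rx ^$" \
    "id:100001,phase:1,chain,auditlog,drop"
    # Second link: the referrer contains none of the whitelisted phrases.
    SecRule REQUEST_HEADERS:Referer "!@pm mydomain.com mydomain.biz" "chain"
        # Last link: the request is for an image file. No actions needed here.
        SecRule REQUEST_FILENAME "@rx \.(gif|jpg|png)$"
```

The indentation is purely cosmetic; it just makes it easier to see which rules hang off the chain starter.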
The regex syntax is also a little different, but it should be fairly obvious to most people what’s going on—bangs negate the match and all that fun stuff. The @rx operator is not strictly necessary in this particular case, but disambiguation is always good in my book.
What’s really cool is that ModSecurity offers a faster alternative to regular expressions wherever you have a lengthy list of straightforward options: the Phrase Match, indicated here in the second condition by the @pm operator. So long as your matching needs are simple, you can shove hundreds or even thousands of entries in here (or into a separate file) with relatively little performance degradation. Phrase matches are also case insensitive, which is nice.
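If the whitelist outgrows a single line, the separate-file route goes through the @pmFromFile operator. A sketch of just the second link of the chain rewritten that way; the filename allowed-referrers.txt is made up for illustration, and as I understand it a relative path is resolved against the directory of the config file that holds the rule:

```apache
# Replacement for the @pm link in the chain (the other two rules stay as-is).
# allowed-referrers.txt is a hypothetical file with one phrase per line:
#   mydomain.com
#   mydomain.biz
SecRule HTTP_REFERER "!@pmFromFile allowed-referrers.txt" chain
```

Editing the file still needs an Apache reload before the new phrases take effect, since the list is read when the config is parsed.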
But back to that first rule and its disruptive actions (which won’t actually fire off until the last rule in the chain matches against the request): that funky looking phase:1 declaration means that the request will be processed by this rule chain as soon as its headers have been read (i.e. in the first phase of the request). If the chain matches, nothing, not even a response code, goes back to the client; the TCP connection is simply torn down, and Apache pretends that the request never even happened from that point forward. Now we’re finally, actually conserving our precious resources!
Did I mention the logging is also much better? Because it is. tail -f your audit log the next time you catch a dirty little leech and see for yourself. Tracking down false positives is so much easier with ModSecurity that it’s really not even funny. This makes it a great way to block content-thieving bots, too.
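The audit log has to be switched on before there is anything to tail. A minimal sketch, assuming a serial (single-file) log; the path is only an example location, and the parts string is one reasonable choice of mine rather than anything the config above requires:

```apache
# Only record transactions flagged as relevant (the chain's auditlog action
# marks a dropped request that way)
SecAuditEngine RelevantOnly
# Write every entry to one file that can be followed with tail -f
SecAuditLogType Serial
SecAuditLog /var/log/apache2/modsec_audit.log
# A: audit header, B: request headers,
# H: audit trailer (messages from the rules that fired), Z: end-of-entry marker
SecAuditLogParts ABHZ
```

With that in place, each dropped request should show up as one entry containing the offending Referer header and a trailer naming the rule that matched.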
Whatever you do, don’t make the mistake of thinking this preschool crap is all ModSecurity is good for (or that I’m the world’s foremost expert on the subject!). I just find it easier to start learning new things by re-implementing old things, and this is one old thing that’s always left a bad taste in my mouth.
Like mod_rewrite, this little module is a veritable powerhouse, sporting more features than any one person is ever likely to need all at once, and so it is well worth further study. No longer is your entire server at the mercy of whatever Bangalorean trade school dropout wrote that horrible WordPress plugin you just couldn’t live without. Need to verify an auth token inserted into your AJAX headers? Easy. Want to keep that constant stream of favicon.ico 404s out of your logs? Easy. Need to make sure no XSS attack ever gets anyone anywhere? That’s right: easy. Read the docs, buy the books that aren’t so hideous, and dig in.