Code / The Appnel Group 

Posted
24 September 2007 @ 2pm

Search Throttling and Shared Hosting Environments

Recently on the mt-dev mailing list, Su asked “Does ThrottleSeconds even work in 3.35?” He noted that he was experiencing a lot of “search requests coming in” at 5-second intervals. He had set the ThrottleSeconds directive in the system’s configuration file, but it wasn’t stopping the bogus search requests.

Not being one to turn down a forensic challenge involving the MT code, and since we all owe Su a lot for his community efforts, I dug in and replied:

It looks like there are two places where the search throttle could die silently that you’ll want to check into.

1) if DB_File is not installed on that system.

This module has been part of the standard Perl distribution since forever, from what I recall, so it’s unlikely. I suppose it’s possible that the underlying compiled libraries got hosed, which could make it fail, but I think that’s less likely than…

2) Your system cannot write to the (configured?) temp directory.

For reasons I am unsure of, the search throttle uses good old BerkeleyDB regardless of the database you are using. In order to do that it must write a file. With no file path to work with, it defaults to the little-used configuration directive TempDir, which defines the system temporary files directory. When (typically) undefined, it defaults to the Unix standard /tmp/. You would have to test whether you can write to that directory. I have experienced systems where a script cannot write to a directory outside of the web root. The default /tmp/ directory would certainly be outside of a web root, in which case mt-search would not be able to write any throttling information and would let the search continue without a warning, error or even a pause.

BTW: One other flaw that occurred to me (though unrelated to your problem) as I looked into this was that you would periodically lose all your throttle information as anything in /tmp/ is subject to periodic deletion. How often depends on the system admin.
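Both suspects from that reply can be checked from a shell account. The following is a rough sketch in Python rather than anything taken from the MT code: the helper names are made up for illustration, and it assumes that `perl` is on the PATH and that you run it as the same user your MT CGI scripts run as (otherwise the write test tells you nothing about what mt-search can do).

```python
import subprocess
import tempfile


def perl_module_loads(module, perl="perl"):
    """Return True if `perl -M<module> -e 1` exits cleanly.

    Hypothetical helper: it only asks the Perl interpreter to load the
    module, which is also what fails if the compiled libraries are broken.
    """
    try:
        result = subprocess.run([perl, "-M" + module, "-e", "1"],
                                capture_output=True)
    except OSError:
        # No perl binary at that path (or it cannot be executed).
        return False
    return result.returncode == 0


def can_write_to(directory):
    """Return True if the current user can create a file in `directory`."""
    try:
        # Actually create (and automatically delete) a scratch file; this
        # is more trustworthy than inspecting permission bits.
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("DB_File loads:    ", perl_module_loads("DB_File"))
    print("/tmp is writable: ", can_write_to("/tmp"))
```

If TempDir is set in your configuration, pass that path to `can_write_to` instead of `/tmp`.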

Su replied that reason 2 was an interesting possibility. A quick check of his configuration and the server’s tmp directory turned up a throttle.db that hadn’t been updated in months and was owned by another user account.

With Su’s reply it dawned on me that he was on a shared hosting server with multiple users running MT. The problem is that MT always uses throttle.db as the name of the database and that tmp is shared by the entire system. This means that, unless TempDir is changed from its default to something account-specific, only the first user on a shared hosting server to enable search throttling can use it. The others are locked out by Unix permissions. Making matters worse, MT does not complain about this; it just carries on with the search.
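To see whether you are one of the locked-out accounts, look at who owns throttle.db and when it was last touched, which is essentially what Su did by hand. Here is a minimal sketch of that check. The function name and the returned fields are made up for illustration, and it assumes the file sits directly inside the temp directory on a Unix system.

```python
import os
import time


def throttle_db_report(temp_dir="/tmp", name="throttle.db", uid=None):
    """Describe the throttle database found in `temp_dir`, if any.

    `uid` defaults to the user running this script, so run it as the
    account your MT install runs under.
    """
    if uid is None:
        uid = os.getuid()
    path = os.path.join(temp_dir, name)
    try:
        info = os.stat(path)
    except FileNotFoundError:
        return {"path": path, "exists": False}
    return {
        "path": path,
        "exists": True,
        # False here is the shared-hosting collision described above.
        "owned_by_me": info.st_uid == uid,
        # A file that has not changed in months is not being used.
        "days_since_update": (time.time() - info.st_mtime) / 86400,
    }


if __name__ == "__main__":
    print(throttle_db_report())
```

A report with `owned_by_me` set to False and a large `days_since_update` matches what Su found on his server.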

While Su was working with an MT 3.35 system, he noted that this same mechanism is in the MT 4.01 code.

Chad Everett joined in on the thread to note:

Finally, I looked at Search.pm and sure enough, tag searches don’t throttle. What is the logic for this? Yes, a tag search is more efficient than a regular one - at least judging by speed. But if the idea is to protect resources, surely both ought to be throttled (or implement a TagThrottle configuration directive or something).

I have few insights on tag search implementations and replied:

Yes, I saw that in the code. Tag search is not subject to throttling. As I recall this was done because the throttle was getting tripped too often by people just casually browsing archives by tags. A good tag search IMHO should be done via mouse clicks and little else. This gives the impression to the user that they are browsing pages rather than searching. I think this is a good thing barring this one issue: a user browsing over content by tags is virtually indistinguishable from someone abusing the system.

So, as I recall, MT 3.3 originally (when it was released in beta) had throttling on tags, but with enough people complaining that they were being throttled while legitimately browsing through their blog’s tags, the logic was thrown in to ignore tag searches to address the immediate complaints.

Honestly I’m not sure what else could have been done. The TagThrottle directive Chad suggested would have to be incredibly low, in my experience using del.icio.us. I commonly fire off a bunch of tag searches in a few seconds trying to whittle down results to find what I want.

The key takeaways are as follows.

  • If you are running MT in a shared hosting environment and want to make use of search throttling, you should add the TempDir directive to your configuration file followed by a full path to a temp directory your account controls.
  • Tag searches are not subject to throttling control.
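
For the first takeaway, something along these lines could set up an account-private temp directory and print the line to add to the configuration file. It is only a sketch: the function name and the example path are made up, and the owner-only permissions assume your MT scripts run under your own account rather than a shared web server user. If they run as the web server user, the directory needs to be writable by that user instead.

```python
import os


def prepare_private_temp_dir(path):
    """Create `path` with owner-only access and return the config line."""
    os.makedirs(path, exist_ok=True)
    # Set the mode explicitly; the mode passed to makedirs is subject to
    # the umask and is ignored if the directory already exists.
    os.chmod(path, 0o700)
    return "TempDir " + os.path.abspath(path)


if __name__ == "__main__":
    # Hypothetical location; use any full path that your account controls.
    print(prepare_private_temp_dir(os.path.expanduser("~/mt-tmp")))
```

Unlike /tmp/, a directory like this is not shared with other accounts and is less likely to be swept by periodic cleanup, so the throttle data should also survive longer.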

