Race conditions and caching variables

I would like to declare an utter hatred of race conditions. A race condition arises when code is written without fully considering that another thread (e.g. another website hit) may be running concurrently. Consider the following, which has been increasingly frustrating me recently:

Drupal stores variables in the ‘variable’ table. It also caches these in the ‘cache’ table so that rather than doing multiple SELECT queries for multiple variables, it simply gets all the variables straight out of the cache table in one SELECT then unserializes them.

cron_semaphore is one of these variables: it is created when cron starts and deleted when cron finishes. If it hasn’t been deleted, that should mean cron hasn’t finished running yet, so the next time cron tries to run it will quit straight away. But due to a certain race condition it doesn’t always get properly deleted, as follows (p2 is an abbreviation for an unrelated process running concurrently, e.g. a visitor to your website):

1. cron starts, cron_semaphore variable inserted (and the variables cache is cleared)
2. p2 starts, variables cache is empty so it runs “SELECT * FROM {variable}” then…
3. cron finishes, cron_semaphore variable deleted and the variables cache is cleared
4. … p2 inserts the result of “SELECT * FROM {variable}” into the cache, but that SELECT ran before cron deleted the variable
5. you now have no mention of cron_semaphore in the variable table, but there it is back in the variables cache!
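
The five steps above can be replayed in a few lines of plain PHP. This is only a sketch: the two arrays are hypothetical in-memory stand-ins for the database table and the cache entry, and the interleaving is written out by hand rather than produced by real concurrent requests.

```php
<?php
// Hypothetical stand-ins: $table plays the variables table,
// $cache plays the cached copy (null means "cache cleared").
$table = array('site_name' => 'Example');
$cache = null;

// 1. cron starts: semaphore inserted, cache cleared.
$table['cron_semaphore'] = 1163452489;
$cache = null;

// 2. p2 starts: the cache is empty, so it reads the whole table.
//    PHP copies arrays by value, so this is a snapshot.
$p2_snapshot = $table;

// 3. cron finishes: semaphore deleted, cache cleared.
unset($table['cron_semaphore']);
$cache = null;

// 4. p2 writes its (now stale) snapshot into the cache.
$cache = $p2_snapshot;

// 5. The table and the cache now disagree.
$in_table = isset($table['cron_semaphore']);
$in_cache = isset($cache['cron_semaphore']);
```

At the end `$in_table` is false and `$in_cache` is true, which is exactly the state that makes cron quit straight away.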

Consider many visits to your website concurrently and you soon realise this can become a very common occurrence. In fact, this exact problem afflicts us at least a handful of times every day. As a result cron keeps trying to run but immediately quits when it sees the semaphore variable still there. After an hour it finally deletes the semaphore but in the meantime crucial stuff doesn’t get done.

Web applications can quickly become riddled with race conditions such as these. I’ve spotted at least two more in Drupal’s core code in the past. When the ‘bugs’ occur as a result they can be tricky to pin down, appearing to be freakish random occurrences. Worse yet, even when found they can be a royal pain to untangle.

Disabling all Drupal CSS files

We needed to disable all of Drupal’s CSS files from our theme. Here’s how we did it:

function THEMENAME_preprocess(&$variables) {

  // Get rid of all of Drupal's CSS files, keeping only our own module's
  $css = drupal_add_css();
  foreach ($css['all']['module'] as $file => $status) {
    if (!strstr($file, 'modules/MYMODULE')) {
      unset($css['all']['module'][$file]);
    }
  }
  // Rebuild the stylesheet markup from the filtered list
  $variables['styles'] = drupal_get_css($css);
}
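
If you want to try the filtering step outside Drupal, here is a rough standalone sketch. The function name filter_module_css() and the sample paths are made up for illustration; the array shape (media, then type, then file path) simply mirrors what the loop above iterates over.

```php
<?php
// Hypothetical helper: keep only module stylesheets whose path contains $keep.
function filter_module_css(array $css, $keep) {
  if (!isset($css['all']['module'])) {
    return $css;
  }
  // foreach iterates over a copy, so unsetting entries here is safe.
  foreach ($css['all']['module'] as $file => $status) {
    if (strpos($file, $keep) === false) {
      unset($css['all']['module'][$file]);
    }
  }
  return $css;
}

// Made-up sample data in the same nested shape.
$sample = array(
  'all' => array(
    'module' => array(
      'modules/system/system.css' => true,
      'sites/all/modules/MYMODULE/mymodule.css' => true,
    ),
    'theme' => array(
      'themes/THEMENAME/screen.css' => true,
    ),
  ),
);

$filtered = filter_module_css($sample, 'modules/MYMODULE');
```

Only the MYMODULE stylesheet is left under 'module'; the 'theme' entries are not touched.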

We also wanted (no *real* need) to use screen.css rather than style.css, so we edited THEMENAME.info to have this:

stylesheets[all][] = reset.css
stylesheets[all][] = screen.css

… and we removed the line for style.css from it.

Finally, as the superuser we went to /admin/build/modules (or /themes, can’t remember now) to refresh the theme cache. We also had to tick to enable the theme at /admin/build/themes as although we’d been using the theme for ages quite fine, it wasn’t actually ticked before.

And hey presto, it worked. Should probably add that it took waaaay too long to do though, so we thought we’d add this snippet for others to read.

Using less Drupal

Lately, I’ve had various frustrations with Drupal which have moved me away from using it for various things. I’d like to go through where I’ve moved away from Drupal, why I’ve made those changes, and my future Drupal decisions.

WordPress rather than Drupal blog

To begin with, this blog is now on WordPress rather than Drupal – and I have to say that I’m loving it… and so are my non-geeky colleagues. It ticks all the right boxes. It’s *really* user friendly. It’s much easier to add photos (and videos) to posts. And it’s hardly taken long at all to set up.

So where did Drupal go wrong with this? Well, I guess it’s the ‘kitchen sink’ approach back-firing. In trying to be something to everyone, Drupal runs the risk of being less than perfect at any one specific task. WordPress, on the other hand, can focus on being the very best blogging software out there, and nothing else gets in the way of that or deters it from its ultimate goal.

But also, we needed to separate the business end of our website (the creation of yearbooks, using tens of thousands of nodes) from the nodes and users to do with the blog, partly because around this time every year we flush out the old site and start anew. So we were either going to have a separate Drupal install for the blog or use WordPress. We chose WordPress.

Form theming frustrations

Have a look at the form here and let me know how you’d create it using FAPI (we’re on Drupal 6.4). I’m talking specifically about the theming of the form. Yes, it’s a very simple form. A search form. But let’s have a look at what’s going on with it and discuss the Drupal way versus the way we ended up doing it.

In FAPI, you’d probably have a ‘textfield’ element for the search box and a ‘submit’ element for the ‘Go’ button. Easy enough, one minute of code. But what about the title ‘Name of your school or group’? Probably a title for the textfield element, no? But then how do we get it centred above both the textfield and the submit button? And what about the text under the two fields? A description, right? Again… how do we get it to appear *exactly* where we want it? The look and feel of the form are to me absolutely crucial. I don’t just want a textfield’s title (with annoying colon after it), a textfield, a description, and then a Go button all one on top of the other.
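
For anyone who hasn’t met FAPI, the definition described above would look roughly like this. Treat it as a sketch: the function name mymodule_search_form(), the ‘keys’ element name and the description text are placeholders rather than our real code, and in a real module the strings would be wrapped in t() and the function passed to drupal_get_form().

```php
<?php
// Hypothetical form builder returning a FAPI-style array.
function mymodule_search_form() {
  $form = array();

  // The search box: title and description are properties of this one
  // element, which is why they are hard to position relative to the button.
  $form['keys'] = array(
    '#type' => 'textfield',
    '#title' => 'Name of your school or group',
    '#description' => 'Placeholder help text shown under the field.',
  );

  // The 'Go' button.
  $form['submit'] = array(
    '#type' => 'submit',
    '#value' => 'Go',
  );

  return $form;
}

$form = mymodule_search_form();
```

The array says what the form contains but nothing about where each piece ends up on the page, which is the whole problem.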

The Drupal solution? Using a theme hook. We define the form in our implementation of hook_form() and then theme that form separately. The ‘programmer’ cares about the functionality of the form but not the visual design of the form. The ‘designer’ doesn’t care about the functionality and instead works on how it looks. But I’m both the programmer and the designer here, and I want my work to be as easy as possible! So, let’s say I go down this route (I tried, I really did). I need to register my theme function in hook_theme(). Okay, I know why, it saves extra code running on every page load. But it’s still annoying. Now I create my theme function… urgh. You’ve got to really know your FAPI stuff to get this to work. I try for a while but then I give up. It just feels so messy with some code somewhere, some in another place, and then when I ask my colleague to have a look so he can learn how to do it he’s disgusted… doesn’t know what’s going on… starts bad-mouthing Drupal. So we build our own massively simple FAPI instead in about half an hour that does just what we want it to do.

So now we’re using our own super-basic FAPI for this form. Not all forms, just the ones we want complete control over, visually. Rather than using hook_form() and defining a form array, we just hard-code the HTML for the form. Some of you may be in complete horror now thinking about this but it’s just by far and away the easiest way to get forms to do exactly what you want. Like a forename and surname field side by side rather than one on top of the other, sharing the title ‘Name’ which is a label for the forename field.

We’re sticking with the idea of a validate() and submit() hook though. I like that one. But we’re doing it slightly differently and more simply, so that any new coders we might hire can quickly and easily pick it up.
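
To give a flavour of the idea, here is a minimal sketch of hard-coded HTML plus a validate and a submit callback. It is not our actual code: every function name, the field name and the ‘search/’ path are invented for illustration.

```php
<?php
// Hard-coded markup: the layout is exactly what is written here.
function search_form_html() {
  return '<form method="post">'
    . '<h3>Name of your school or group</h3>'
    . '<input type="text" name="name" /> '
    . '<input type="submit" value="Go" />'
    . '</form>';
}

// Hypothetical validate step: returns a list of error messages.
function search_form_validate(array $values) {
  $errors = array();
  if (!isset($values['name']) || trim($values['name']) === '') {
    $errors[] = 'Please enter the name of your school or group.';
  }
  return $errors;
}

// Hypothetical submit step: returns the path to send the visitor to.
function search_form_submit(array $values) {
  return 'search/' . rawurlencode(trim($values['name']));
}

// Tiny dispatcher: run validate, and only run submit if there were no errors.
function simple_form_process($form_id, array $values) {
  $errors = call_user_func($form_id . '_validate', $values);
  if ($errors) {
    return array('errors' => $errors, 'result' => null);
  }
  $result = call_user_func($form_id . '_submit', $values);
  return array('errors' => array(), 'result' => $result);
}

$ok = simple_form_process('search_form', array('name' => 'Hill School'));
$bad = simple_form_process('search_form', array('name' => '  '));
```

Everything to do with one form sits in one place, which is the point: a new coder can read it top to bottom.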

Going nodeless

I don’t always like nodes. I really, really don’t. I don’t need revisions, and if I did I’d do them in my own way, just the right way for me, rather than a way that kind of works for everyone but not quite perfectly for anyone in particular. I don’t like the way that as uid=1 I get all the extra bits like ‘Authoring information’ which I never touch. Node hooks and the nodeapi, which I once loved, are now a higgledy-piggledy mess that’s a real pain for my new hire. I try to explain to him what’s going on when I save a node. “So this function deals with the submission. But not the core node stuff, Drupal deals with that. And if we want something used for all nodes, we put it in here instead. And we can also override this specific bit here.”… he looks on in amazement, totally baffled by what’s going on and why. It would be so much easier for him (and me) to understand if everything were just in one place.

So what do we gain from nodes? Umm… not much really. We don’t use contributed modules any more because they never do exactly what I want and always do stuff which I don’t want them doing, which just makes them less efficient. We put all our code in our own one module instead. A massive, hefty module with a dozen or so include files.

We gain the ability to always do $node->nid and use node_save() and other handy things. But we don’t really need nodes, and it frustrates me having to do the extra INNER JOINs on node_revisions etc. So we’re trialling not using nodes at all for one of our content types – our customers. We just have a simple ‘id’ field now in one single table. We no longer need to INNER JOIN node and node_revisions. We haven’t had any problems so far, and the new hire is finding it much easier to code now, without the ‘baggage’ of Drupal.
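
To make the difference concrete, here is roughly what the two kinds of query look like as they would be handed to db_query(). The join on vid follows the Drupal 6 node schema; the ‘customer’ type name, the {customer} table and its ‘name’ column are assumptions for the sake of the example.

```php
<?php
// With nodes: the title lives in node_revisions, so every read needs a join.
$node_query = 'SELECT n.nid, r.title FROM {node} n '
  . 'INNER JOIN {node_revisions} r ON r.vid = n.vid '
  . "WHERE n.type = 'customer'";

// Without nodes: one table with a plain id (hypothetical table and columns).
$plain_query = 'SELECT id, name FROM {customer}';
```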

The future

Our current plan is to gently migrate away from Drupal, perhaps altogether. We like the idea of building our own framework again, one that does exactly what it needs to do for our site. It’s not something we can do overnight. Ours is a yearly cycle, following the academic year, and the current plan is to fork the codebase in around January/February and that would mark the beginning of our own framework if we still feel that way then.

In the meantime, we’ll continue using nodes for most of our content types (if simply because migrating away from them would be a long and arduous task with little reward) and we’ll continue to use FAPI for most of our forms. But I see us using our own simple FAPI for more and more forms where we need complete control over them, and I see us extending this FAPI to help us avoid repeating the same code multiple times.

I think I still like Drupal. I definitely appreciate the vibrant community. But sometimes I thoroughly hate Drupal and get massively frustrated by it. But I still like it in theory at least. One framework for all my websites. But whilst I just work on one massive website it just has so much less use to me.

You’re more than welcome to urge me to stay with Drupal. In fact, I really hope someone can manage this. I’ve put a lot of time over the last few years into learning Drupal, and it would be a great waste and a shame to lose all that.

Debug live errors more easily with debug_backtrace() output in error messages

Tailing my error log, I kept coming across annoying errors like this:


[Mon Nov 13 21:14:49 2006] [error] [client xx.xxx.xxx.xxx] PHP Warning: mysql_real_escape_string() expects parameter 1 to be string, array given in /path/to/drupal/includes/database.mysql.inc on line 350, referer: http://www.example.com/node/1234/edit

No matter how hard I tried, I couldn’t reproduce the errors locally, but somehow real users could create them on the live server. I tried to track down the bug but could only go so far – yes, it was happening when a node was edited, and it was in a database query, but which query? There were too many to look at, so I needed more information.

Because I couldn’t reproduce the bug locally, no amount of dprint_r() and such had any effect. Following some advice in #drupal I explored debug_backtrace() and added its output to a custom error log like this:


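function mymodule_error($errno, $errstr, $errfile, $errline, $errcontext = array()) {
  // $errcontext is optional because newer PHP versions no longer pass it.
  // ERR_LOW, ERR_MED and ERR_HI are not built-in PHP constants; they are
  // assumed to be defined elsewhere in the module before this handler runs.
  if (in_array($errno, array(ERR_LOW, E_NOTICE, E_STRICT))) {
    // don't log unimportant errors
    $level = false;
  }
  elseif (in_array($errno, array(ERR_MED, E_WARNING))) {
    $level = 'WARNING';
  }
  elseif (in_array($errno, array(ERR_HI, E_ERROR))) {
    $level = 'FATAL';
  }
  else {
    $level = 'UNKNOWN';
  }

  if ($level) {
    // Build a "function:line" chain from the call stack, innermost call first
    $functions = array();
    foreach (debug_backtrace() as $frame) {
      // Some frames (e.g. callbacks) have no line number
      $line = isset($frame['line']) ? $frame['line'] : '';
      $functions[] = $frame['function'] . ':' . $line;
    }
    $trace = join(' <= ', $functions);

    error_log($level . ' ' . $errfile . ':' . $errline . '. ' . $errstr . '. (' . $trace . ')');
  }
}
set_error_handler('mymodule_error');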

I put the code in a module file, and it didn’t work… an hour of hair pulling later I found the culprit to be devel.module:

function devel_init() {
  restore_error_handler();
}

… this was undoing my error handling setting, so I had to comment out the restore_error_handler() line and my error handler took centre stage.
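
If you want to see the effect for yourself, here is a small standalone sketch. It relies on the fact that set_error_handler() returns whichever handler was active before the call; the handler name demo_error_handler() is invented for the demo.

```php
<?php
// A do-nothing handler, just so there is something to register.
function demo_error_handler($errno, $errstr) {
  return true;
}

// Remember what was active before we registered ours.
$before = set_error_handler('demo_error_handler');

// This is the call devel_init() was making: it pops the most recent handler.
restore_error_handler();

// Registering again returns the handler that was active at this point.
// If ours had survived we would get 'demo_error_handler' back; instead we
// get the same value as $before.
$active_after_restore = set_error_handler('demo_error_handler');
$handler_was_removed = ($active_after_restore === $before);

// Tidy up the handler registered by the check above.
restore_error_handler();
```

`$handler_was_removed` comes out true, which is why a handler set before devel_init() runs silently stops doing anything.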

With the new code live, I just waited a few hours until the bug showed up again – this time, with extra backtrace information to help me debug:


[Tue Nov 14 18:13:14 2006] [error] [client xx.xxx.xxx.xxx] WARNING /path/to/drupal/includes/database.mysql.inc:350. mysql_real_escape_string() expects parameter 1 to be string, array given. (mymodule_error: <= mysql_real_escape_string:350 <= db_escape_string:152 <= _db_query_callback: <= preg_replace_callback:196 <= db_query:1040 <= my_custom_function:659 <= my_nodetype_update:297 <= node_invoke:494 <= node_save:1876 <= node_form_submit: <= call_user_func_array:206 <= drupal_submit_form:132 <= drupal_get_form:1620 <= node_form:2093 <= node_page: <= call_user_func_array:418 <= menu_execute_active_handler:15), referer: http://www.example.com/node/1234/edit

So now I knew exactly where the bug was coming from. ‘db_query’ was called on line ‘1040’ of my file, and of the several ways to get into the function in which line 1040 resides, I knew it had come from my_custom_function on line 659.
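
The way the chain reads (each entry is a function name followed by the line it was called from, innermost call first) is easy to reproduce in a throwaway script. The three demo_ functions below are invented purely for this sketch.

```php
<?php
// Build the same "function:line <= function:line" chain as the handler does.
function demo_chain() {
  $parts = array();
  foreach (debug_backtrace() as $frame) {
    // 'line' is where the function was called from; it can be missing.
    $line = isset($frame['line']) ? $frame['line'] : '';
    $parts[] = $frame['function'] . ':' . $line;
  }
  return implode(' <= ', $parts);
}

function demo_inner() {
  return demo_chain();
}

function demo_outer() {
  return demo_inner();
}

$chain = demo_outer();
```

`$chain` starts with demo_chain, then demo_inner, then demo_outer, each followed by a line number, so you read it left to right to walk back up the call stack.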

Armed with this extra information, tracking down the bug was a breeze. Stupidly, I’d forgotten to do a regular expression I was supposed to do on a variable before doing a db_query, meaning arrays which were supposed to be left alone were wandering into the query pretending to be strings.

Drupal Community Yearbook

In our effort to give something back to Drupal (the fantastic open source content management framework at the core of AllYearbooks), we’re aiming to make a physical yearbook for members of the worldwide Drupal community.

Presuming all goes to plan and enough people join, the idea is to produce physical copies of the yearbook (as well as allowing PDF downloads). These will be provided free to Dries and the top 10 Drupal contributors. It may also be possible for other interested members to buy a copy, though this book isn’t being run as a money-making venture.

If you’re a Drupal developer or involved in Drupal in some other way, join the yearbook now. It only takes a few seconds to join, and you can come back any time before printing to update your yearbook entry and upload some gallery photos for collages and other such pages.

So far 8 Drupalites have joined the yearbook. We’re hoping for at least 100 members for it to be worth printing.

If you have any suggestions, please get in touch.