detriot.org

Benjamin Chodoroff's blog

While working on Excellent Schools Detroit’s school data API, I ran into an intersting set of problems relating to stacking database queries on top of one another.

To illustrate: ESD’s education data is made of a base entity (say, a school or an early childhood center) and data related to that entity (a school’s ACT score, an early childhood center’s state rating set, or a “profile” that can be edited by a school administrator). The scorecard allows parents to sort & filter schools and early childhood centers by all sorts of parameters. Many of these filters involve joining against the same table. We don’t want to join the same table twice, so I needed to figure out a way to add a join to the query only if it wasn’t already added.

In the case of this specific project, I don’t need a generic “de-duplicate any potential JOIN” system – the queries that I’m worried about are complex enough that it doesn’t make sense to come up with a comprehensive test & optimization. The Views module de-duplicates joins, interestingly enough; unfortunately it uses a very cool method get_join_data that only exists in views query land, and not in db_select. I settled on using SelectQuery’s tags & dynamic query alteration, with a bit of special sauce.

Drupal’s ORM offers a system to add a tag to cetain queries (say, has_arts), and in turn, apply alterations to them (via a function hook_query_has_arts_alter). In preExecute, the tags are processed in a foreach loop & the hook_TAG_alters are applied. This means that a tag added during a hook_TAG_alter is added to the query’s alterTags property but its hook_TAG_alter is not invoked. In an ideal world, SelectQuery would allow tags to be added during query alteration. I don’t really know how to make that work, though, so I settled on doing something like this:

if (!$query->hasTag('do_whatever_join')) {
  modulename_query_do_whatever_join_alter($query);
  $query->addTag('do_whatever_join');
}

This is dumb, but it gets the job done. I’d like to come up with a cleaner way of handling this, but I haven’t found many references for de-duplicating JOINs in an ORM. I dug up this paper, along with the Views module I linked to above… that’s it, so far. Drupal 8 may eventually offer a hasJoin method, which would make much more sense than using tags.


Thanks for reading. Email me at ben at falafelcopter dot com. Read my posts here.