While working on Excellent Schools Detroit’s school data API, I ran into an intersting set of problems relating to stacking database queries on top of one another.
To illustrate: ESD’s education data is made of a base entity (say, a school or an early childhood center) and data related to that entity (a school’s ACT score, an early childhood center’s state rating set, or a “profile” that can be edited by a school administrator). The scorecard allows parents to sort & filter schools and early childhood centers by all sorts of parameters. Many of these filters involve joining against the same table. We don’t want to join the same table twice, so I needed to figure out a way to add a join to the query only if it wasn’t already added.
In the case of this specific project, I don’t need a generic “de-duplicate any potential JOIN” system – the queries that I’m worried about are complex enough that it doesn’t make sense to come up with a comprehensive test & optimization. The Views module de-duplicates joins, interestingly enough; unfortunately it uses a very cool method get_join_data
that only exists in views query land, and not in db_select
. I settled on using SelectQuery’s tags & dynamic query alteration, with a bit of special sauce.
Drupal’s ORM offers a system to add a tag to cetain queries (say, has_arts
), and in turn, apply alterations to them (via a function hook_query_has_arts_alter
). In preExecute, the tags are processed in a foreach loop & the hook_TAG_alter
s are applied. This means that a tag added during a hook_TAG_alter
is added to the query’s alterTags
property but its hook_TAG_alter
is not invoked. In an ideal world, SelectQuery would allow tags to be added during query alteration. I don’t really know how to make that work, though, so I settled on doing something like this:
if (!$query->hasTag('do_whatever_join')) {
modulename_query_do_whatever_join_alter($query);
$query->addTag('do_whatever_join');
}
This is dumb, but it gets the job done. I’d like to come up with a cleaner way of handling this, but I haven’t found many references for de-duplicating JOINs in an ORM. I dug up this paper, along with the Views module I linked to above… that’s it, so far. Drupal 8 may eventually offer a hasJoin
method, which would make much more sense than using tags.