Databases: The Next Generation--Notes

One of the DrupalCon presentations that I wanted to be sure to watch early on was Databases: The Next Generation, presented by database maintainer and self described gadfly Larry Garfield.

There have been some big changes in the Database Abstraction layer in Drupal 7, which I've been using in patches to core and porting a contib module or two to D7. I've already groked dynamic queries and why they are great from reading up on PDO, but since most of my database experience has been using Drupal 6, I still haven't quite figured out how transactions work even though they've been around in databases for a long time ....and I have no idea about how to make master/slave work.

Even though, as Garfield says, everything in this presentation is covered in the docs at http://drupal.org/node/310069, this presentation is a great and worthwhile intro to the D7 db abstraction layer.


There are only 5 lines of code that didn't change, and it's now completely object oriented under the hood. Can now do DB specific optimization in the driver.

Type-handling is handled automatically, which is cool. There is now a driver for Oracle and MS SQL, which means real support for 5 databases (amazing!)

The database abstraction now uses a fluent API like jQuery. Garfield said that it should be familiar "which is good". I don't know if they were considering usability and how it is good to work with Web developer's mental model of how things work, but I'm guessing yes.

[8:00] The configuration (within Drupal) for a master/slave setup seems pretty simple... though I am not a server geek, so I don't know what all else goes into setting it up. Core specifies which kinds of queries can be directed at the slave servers, contrib modules can do that, but sounds like it could be complicated. 

 There is a flag in the session that will disable slave server access if something has changed. For instance, if you post a node, you won't access the slave for a certain period of time to ensure that you see the node. The rest of the world won't see the node until it has replicated to the slave server. Core does this for node save and comment save.

[11:20] Now a whole bunch of queries:

Queries

Static (select)—db_query

Majority case, for queries that don't really change. The only change is you now use a named placeholder instead of '%s' or '%d', etc. No quotes because we're using prepared statements. Basically makes a template and drops values in like function parameters. Do it this way and SQL injection doesn't happen. (Sidenote: benchmarks show this is about even with the old system).

[19:00] You can pass in options to db_query, most likely the only two to use are 'target' (if you want to direct to the slave) and 'fetch' (to change the fetch mode to associative array, numeric array, an array keyed by both, or a instances of a certain class processed in C code before returned). A query is slave safe if you don't care if the data in it is 2-3 mins old. If there is no slave server, target is ignored.

Lots of ways to fetch data now, which has simplified lots of places in core where we were looping for no purpose. Now it is shoved all down into DB layer.

[25:45] Because all of these methods return objects, you can chain just the way you do in jQuery and achieve things in one line of code. For instance, db_query("SELECT nid, title FROM {node}")->featchAllKeyed(); (This received a round of applause)

fetchObject and fetchAll are the only ones that happen in C code, but it is better because code is consolidated.

Dynamic queries—db_select

Don't want to use this unless you need it.

[28:00] This gives us hook_query_alter. Builds queries as a data structure. You can add a tag to your query and then it will get passed to hook_query_alter.
join() and addField() can't be chained because they could return something besides the object.
 
You should always add the node access tag if you are running against the node table for security reasons. Extenders are available, for instance extend('PagerDefault').

db_insert, db_update

[42:00] Allows multi-insert, which is faster than the old way.

You can also insert from a query.

db_merge

[47:30] Counter increment

Transactions

Several queries that run together, if one fails they all fail. Drupal has a wrapper around PDO's transactions (partially to hide that they don't work with MyISAM tables).

All you do is create an transaction object, catch an exception and rollback the transaction (with a message to the watchdog).

Docs

drupal.org/node/310069