CakePHP, nested one to many joins, lookup lists, calculated fields & find E-mail

It's quite easy to do everything that this article is about using a model's query method, however, the Cake devs say that this is a bad idea with good reason. Using query, amongst other lost functionality, you also lose protection against SQL injection attacks that's automatically provided by the Cake framework.

This article attempts to provide an explanation, and methods for avoiding the use of query wherever possible.

Nested one to many joins

The following diagram is a simple example of the kind of table structures that I have had to model in Cake.

  • Each Group has many Users
  • Each User has many Sales
  • Each Sale has many Items
  • Each Sale has one DeliveryMethod
  • Each Item has many Products
  • Each Product has one Style

Cake performs very well when using find on the database to retrieve information such as the contents of the sales table for a given user, however, attempting to retrieve all the product & user details can cause things to go a bit haywire, depending on which model is used to perform the find.

  • User->find won't return products.
  • Product->find will return styles, but not deliverymethods, users or groups.

Increasing recursion to level 2 can resolve these issues, but will also produce a lot of extra database hits.

Using containables

Containable behaviour is activated by adding the following to all models that are part of a find.

var $actsAs = array('Containable');

To perform a query using containables, an array describing all the tables, joins and fields required is passed to the model's find method.

Looking at the original table structure above, I want to fetch a complete list of products sold by users. To achieve my goal, the following code would be needed.

$contain = array(
'Sale' => array(
'fields' => array(
'Sale.id',
'Sale.user_id',
'Sale.deliverymethod_id',
'Sale.date'
),
'Item' => array(
'fields' => array(
'Item.id',
'Item.sale_id',
'Item.product_id',
'Item.cost'
),
'Product' => array(
'fields' => array(
'Product.id',
'Product.style_id',
'Product.description'
),
'Stye' => array(
'fields' => array(
'Style.id',
'Style.stylename'
)
)
)
),
'Deliverymethod' => array(
'fields' => array(
'Deliverymethod.id',
'Deliverymethod.method'
)
)
)
);
$fields = array('User.id', 'User.name');
$this->User->find('all', compact('fields', 'contain'));

The $contain and $fields arrays demonstrate the minimum information that is needed to get containables to work, and you must follow these rules:

  1. All primary and foreign keys for all tables in the containable array must be specified, even if you don't intend to use them.
  2. The fields to be returned by each model have to be specified in the containable array, not the fields array.
  3. The model that the find is executed against is not to be included as part of the $contain array.

The first point is the most important. If you don't include all the primary and foreign keys then Cake will simply not return data from some of the tables, despite executing queries to fetch the data. The reason for this is that Cake uses the primary and foreign keys to establish joins between the models, so if the keys aren't included, Cake won't know how to join the result sets.

Point two is the primary cause for "field not found" types of errors. In the example, the only columns that should be returned by the $fields array are those in the Users model. You may encounter situations where it seems to work, but that will only be by chance, when Cake happens to have executed a query that includes a join to another table. Generally, it's safer if you follow the rule.

Lookup lists

Going back to the example data structure above, the products, styles and deliverymethods tables are actually lookup tables, and as such, they don't join back to the tables that make up the rest of the sales data structure.

Logically it would seem that each Item has one Product, however, when joining a lookup table, it's actually the reverse, each Item belongs to one Product.

<?php
class Product extends AppModel {
var $name = 'Product';
var $actsAs = array('Containable');
var $belongsTo = array('Product');
?>

Calculated fields

Another thing that can be tricky in Cake is using calculated fields, meaning, fields that are executed as part of an SQL statement, without falling back to using the query method.

There's a gotcha to look out for when using calculated fields and containables. If you wanted to concatenate two fields from different tables to make a single returned field, then you have to put the calculated field into the $fields array, not in the $contain array.

The second thing to watch out for is that Cake will attempt to put quotes around your calculated fields. To stop that happening, put the SQL into a set of brackets.

Looking at the containables example above, to concatenate the delivery method and product name, you'd need to change the $fields array.

$fields = array('User.id', 'User.name', '(Product.description + ' ' + Style.stylename) AS product');

Putting the calculated field into the $contain array will sometimes cause a field not found type of error.

So, the rules here are.

  1. All calculated fields should be specified as part of the $fields parameter.
  2. To stop Cake trying to reformat your calculated field, place it in a set of brackets.

Lookup lists as calculated fields

What happens if you, like me, happen to have a generic lookup table that contains lists of strings? In nearly every project I've ever worked on, I've had this situation, and rather than deal with many tables that have just a few entries, I usually create a table that's something like this.

CREATE TABLE `lookups` (
`list` varchar(100) NOT NULL,
`list_id` int(11) NOT NULL,
`value` varchar(200) DEFAULT NULL,
PRIMARY KEY (`list`,`list_id`),
KEY `listgroup` (`list`)

In the example dealing with lookup lists, I gave the code for the products table, but left out delivery methods and styles because they are both stored in the lookups table.

This means that method for dealing with these two lookups has to be different because in addition to a record id, a list has to be specified. I have found that simply adding the list to the conditions array can cause Cake to screw up the SQL, so rather than split these short lists into unique tables, I've opted to simply use a sub-query in calculated field to fetch the value.

$fields = array('User.id', 'User.name', '(SELECT value FROM lookups WHERE lookups.id = Style.style_id AND lookups.list = 'STYLES') AS style');

Because this is a calculated field, the usual rules apply, with a couple of extras.

  1. All calculated fields should be specified as part of the $fields parameter.
  2. To stop Cake trying to reformat your calculated field, place it in a set of brackets.
  3. Sub-query tables must be specified by the actual table name, because they don't have a model.
  4. Tables belonging to models that are referenced in the sub-query must be specified by model name, not table name.

You can see the last two rules applied in the example. The lookups table is referenced by it's real name, whereas, the table we're joining to, styles, is referenced by it's model name, Style, because Style is part of the $contain array.

blog comments powered by Disqus