Saturday, July 1, 2006

Where is Group By In XQuery?

XQuery was designed, and is used today, as a language for querying XML data sources, including XML databases. So you would expect XQuery to offer the same level of expressiveness for XML data that SQL offers for relational data.

A quick glance at the language would make one think that this is the case. But as I was writing some XQuery for a client last week, I stumbled upon a problem: where is group by in XQuery?

When looking at the different types of queries sent to a database, you can distinguish between queries that retrieve a piece of information and queries that aggregate and analyze the data stored in the database. It is the difference between getting the details of a specific invoice, or the list of clients located in California, and getting the average raise employees received last year per department, or the top 5 selling products by product family. You can easily write the last two queries in SQL with a group by. Writing the equivalent in XQuery is possible, but harder and more verbose.
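
To make the verbosity concrete, here is a rough sketch of the "average raise per department" query written in plain XQuery 1.0. The file name employees.xml and the element names employee, department and raise are hypothetical, chosen only for illustration. The usual workaround is to compute the grouping keys yourself with distinct-values() and then go back to the data to collect the members of each group:

```xquery
(: Sketch only. The file name and element names are hypothetical.
   Assumed input shape:
   <employees>
     <employee><department>Sales</department><raise>3.5</raise></employee>
     ...
   </employees>

   SQL counterpart, for comparison:
   SELECT department, AVG(raise) FROM employees GROUP BY department :)
let $employees := doc("employees.xml")/employees/employee
(: Step 1: compute the distinct grouping keys by hand :)
for $dept in distinct-values($employees/department)
(: Step 2: scan the data again to collect the members of each group :)
let $raises := $employees[department = $dept]/raise
order by $dept
return
  <department name="{$dept}">
    <average-raise>{ avg($raises) }</average-raise>
  </department>
```

Note that the grouping key is evaluated twice, once to list the distinct departments and once to filter the employees for each department. That is essentially a self-join, and it is where both the extra typing and the potential extra work for the query processor come from.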

In fact, the problem has been recognized by Vinayak Borkar from BEA, who suggested an extension to the XQuery syntax to solve it in his paper Extending XQuery for Grouping, Duplicate Elimination, and Outer Joins. Hopefully Vinayak's suggestion or a similar extension will be added to the next version of XQuery. In the meantime, database vendors may choose to implement their own extensions to XQuery for that purpose.

1 comment:

  1. Group by is planned for XQuery 1.1 (see #2.3.1)
    IBM is going to make a proposal that is largely compatible with BEA's.