Google+ Followers

donderdag 30 januari 2014

Why I like PostgreSQL more than MySQL: Windowing

When you deal with customers in a shop you will want to log changes to your stock and your orders, both to generate statistics and to have a way of tracing any problems. The simplest way to do this is through a trigger that simply creates a copy of the entire record in a separate auditing table.

MySQL can do this just as well as PostgreSQL, but analysing the data is a different story.

Whenever you look a a history of events one of the first things you will want to know is: how much time elapsed between each row? Traditionally you would solve that by adding  a self-join to fetch the previous record, and subtracting the two timestamps.

Windowing allows you to do it much faster, in a much more flexible way that is infinately more readable:

timestampfield - LAG(timestampfield) OVER (ORDER BY timestampfield)
FROM tale;

Yes, it really is that simple. LAG() fetches the previous record and the OVER(ORDER BY ...) indicates that "previous" should be taken from "the record with the highest timestamp that is still lower than the current record's timestamp".

Windowing also allows you to fetch the "second to previous" record, or "the ten previous records", and obviously you can do the same with records following the current record, using LEAD().

There are many other things you can do with windowing, like subdivide the data into groups of equal values in a particular field and ranking.

See also: