Friday, August 28, 2015

Statistics queries with time range for PostgreSQL and Hibernate

Few days ago I was asked to implement charts with some finance data using JS charting library, in the application working on Postgres database. There's a lot of finance stuff in the database stored every minute and I just needed to show this data on charts.

OK, but charts are nice if you can restrict amount of your data returned from the server to some limited number of items. Returning 1000 items would both - make chart very heavy and unreadable and consume a lot of net traffic. I assumed 50 as the average number of items that are nicely displayed on such charts and are also acceptable from the traffic point of view.

It would be nice to split time data to some periods that always assert no more than 50 items and then return only such statistics to the client side. And here I discovered very nice and applicable postgres function, which is date_trunc.

Equipped with this weapon I figured out universal solution to calculate such time-based statistics for postgres database using Hibernate. Firstly I designed a very simple enum to calculate right time-grouping criteria:

It'd be nice then to have everything working with whatever entity type and to be able to customize grouping query with Hibernate criteria, because this is right abstraction to build SQL query step by step from filters hypothetically associated with chart filters:

It's also quite simple to autodetect best data granulation for the chart (eg. if it is a hour, day, month or something else) depending on the base time range for the chart.