Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I've been struggling with a similar query recently perhaps better suited for a time series DB. Can DISTINCT ON be used to ask the query "latest record per customer per day". For example, let's say we have a bunch of devices that send sensor readings arbitrarily throughout each day. Could DISTINCT ON be used to query the last reading of each day per device?


Also consider expressin this with a lateral join.

Scroll a bit down here for a similar example.

https://stackoverflow.com/questions/11472790/postgres-analog...


Absolutely. You'll need an expression to get the date out of the timestamp, then you want rows that are distinct on customer_id & date order by timestamp desc.


I've been using Timescale DB, a postgres extension, to store our sensor-produced timeseries data and it handles this pretty well. DISTINCT ON seems to ruin the query plan there, by forcing sequential, non-index scans of each subtable (timescale breaks tables into time-buckets behind the scenes, so it can skip searching tables in time-filtered queries). I've had good performance with their "time_bucket" and "last" functions, though, and not having to use some NoSQL time series database has been really nice.


Yes, you can `DISTINCT ON` with a `DATE_TRUNC` expression which should get you what you want.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: