Interesting, honestly speaking we have not thought about time series data a lot ...

synthc · on July 26, 2020

For timeseries data I encoded a [entityId,timestamp,attribute] tuple to a big integer, using a order preserving mapping to ensure that the datoms are sorted by the timestamp. This provided the right functionality, for example using seek-datoms we could retreive the datoms with timestamps between some range, but performance was poor. I think a custom index could help a lot here. We also had problems with the database growing to large, and needed to manually shard the database over time.

A datalog equivalent to TimescaleDB (which extends Postgres with timeseries optimiziations and time based table partitioning) would be great.

For client access I tried to define access rules based on attributes (similar to how many graphql frameworks handle this), I tried to express this using datalog rules. For example, users hava permission to access :user/items, and :items/blabla, so a user X can access [X :user/items Y] and [Y :items/blabla Z] Some experiments were promising, but it was slow and I did not find a good way to integrate this.

whilo · on July 27, 2020

I see, so your problem was that you wanted to scan over all Datoms for one entity over a time period and you would have needed to have an EVAT index? In Datahike it would be fairly simple to add new indices like this.

Yes, access management must not incur a large overhead, that is why many systems have a separate restricted way to express and track rules. My hunch is that it still would be better to keep it in Datalog and specialize the query engine that it is fast on these (potentially restricted) rules and relations.