Some of these issues were solved by Spark. I do agree with the overall point, though: people shouldn’t be reaching for Hadoop when Postgres would suffice. Indexes are fast. Use them.
Indexes are fast when they're built well and used often. Indexes are expensive (and paid for in triplicate via backup costs) when they are seldom or never used. Sometimes you just need to materialize a table temporarily, which of course you can do in the RDBMS as well, but sometimes the data sources are so scattered (or so ephemeral) that keeping all the processing inside the DB system is a stretch.
But perhaps the most compelling justification is how familiar the team is with DB systems. Not everyone has the same level of SQL expertise, and for some people the visualization tools added to MapReduce systems, and the source language itself, are more familiar than the output of an EXPLAIN statement. That's especially true if the same pipeline would effectively be hundreds of lines of SQL.
> people shouldn’t be reaching for Hadoop when Postgres would suffice. Indexes are fast. Use them.
let me try a riskier one: "people should not be reaching for Kubernetes when a system administrator would suffice. sysadmins are cheap(er?) than clouds. use them."
it did not come out very well... I got stuck trying to find what to contrast kube with; all I got in the minutes allotted to comment posting was 'system administrator'. meh.
Everyone wants to use k8s and like 1% need it in any shape or form. It basically acts as a conspiracy between otherwise-redundant ops-level people and those running Kubernetes, against the companies who have to pay for all this. Just use Fargate and be done.