Hacker News

An incredible amount of software and infrastructure is written precisely for analytics data gathering workloads.

I'm pretty confident AWS's product for this use case would be Lambda and the new on-demand DynamoDB.

Is there actually a use case in analytics that requires a server that accepts connections from multiple clients, and then has to have <60ms latency including state over the wire and executing sophisticated business logic, between those clients, for time periods longer than 5 seconds? I.e. something that resembles a video game?

Because if there isn't, if your goal is to scale, why have containers at all?



I'm under the impression that Lambda gets expensive once you have many requests. E.g. the ipify story that showed thousands of USD on Lambda vs. hundreds on Heroku:

“Today the service runs for between $150/mo and $200/mo depending on burst traffic and other factors. If I factor that into my calculator (assuming a $200/mo spend), ipify is able to service each request for total cost of $0.000000007. That’s astoundingly low.

If you compare that to the expected cost of running the same service on a serverless or functions-as-a-service provider, ipify would cost thousands of dollars per month.”
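A rough sanity check of the numbers quoted above (assumptions: a $200/mo spend at $0.000000007 per request as stated, and Lambda's published $0.20 per million invocations, ignoring GB-second compute charges entirely):

```python
# Back-of-envelope check of the quoted ipify figures.
monthly_spend = 200.0               # assumed $/mo from the quote
cost_per_request = 0.000000007      # $/request from the quote
requests_per_month = monthly_spend / cost_per_request  # ~28.6 billion

# Lambda's request charge alone, at $0.20 per million invocations
# (compute time excluded, so this is a lower bound).
lambda_monthly = requests_per_month / 1_000_000 * 0.20

print(f"{requests_per_month:,.0f} requests/mo")
print(f"${lambda_monthly:,.0f}/mo in Lambda request charges alone")
```

That lands in the mid-thousands per month before any compute charges, which is consistent with the "thousands of dollars" claim in the quote.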


I'm sure that's true, but you have to be churning through quite a lot of requests to get there.

Sadly so, because I can't stand Lambda!


Batching incoming requests, for one. Kinesis only allows 5 write requests per second per shard, for example. Also, Lambda has limits on concurrent executions and is very slow (~10 s) when it needs VPC connectivity (in that case the default concurrency limit is 350 due to ENIs).


Hmm... I don't see anything in the docs implying that. The Kinesis API docs say you can ingest 1,000 records or 1 MB per shard per second. There is a 5/s limit on reads, but those operate on batches of records anyway.
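For what it's worth, staying under those per-shard ingest limits mostly comes down to batching writes client-side. A minimal sketch (the helper name and the exact thresholds here are illustrative; PutRecords itself also caps a single call at 500 records):

```python
# Chunk records so each batch fits in one Kinesis PutRecords call
# while staying under the per-shard ingest ceiling discussed above.
MAX_RECORDS = 500      # PutRecords limit per call
MAX_BYTES = 1_000_000  # ~1 MB/s per-shard ingest limit

def batch_records(records):
    """Yield lists of byte-string records sized for one PutRecords call."""
    batch, batch_bytes = [], 0
    for rec in records:
        size = len(rec)
        # Start a new batch if adding this record would exceed either limit.
        if batch and (len(batch) >= MAX_RECORDS or batch_bytes + size > MAX_BYTES):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(rec)
        batch_bytes += size
    if batch:
        yield batch
```

Each yielded batch could then be handed to a single `PutRecords` call, with one write request covering up to 500 records at once.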

We have one service running that ingests data into a Kinesis stream exposed as an API Gateway endpoint. Preprocessing is done in Lambda in batches of 100 records, and the processed records get pushed to Firehose streams for batched loads into a Redshift cluster for analytics. So far we've been very happy with the solution: very little custom code, almost no ops required, and it performs and scales well.
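The Lambda step in a pipeline like that tends to be very small. A sketch under assumed record shapes (the `transform` body and the Firehose stream name are placeholders; the event shape is what Lambda delivers for a Kinesis trigger, with record data base64-encoded):

```python
import base64
import json

def transform(payload: dict) -> dict:
    """Placeholder preprocessing; the real logic is service-specific."""
    payload["processed"] = True
    return payload

def handler(event, context):
    """Decode a batch of Kinesis records and build the Firehose batch."""
    out = []
    for rec in event["Records"]:
        # Kinesis record data arrives base64-encoded in the Lambda event.
        payload = json.loads(base64.b64decode(rec["kinesis"]["data"]))
        # Newline-delimit records so Redshift's COPY can split them.
        out.append({"Data": json.dumps(transform(payload)) + "\n"})
    # In the real function this batch would go to Firehose, e.g.
    # firehose.put_record_batch(DeliveryStreamName=..., Records=out)
    return out
```

With Lambda's Kinesis trigger configured for a batch size of 100, each invocation gets up to 100 records in `event["Records"]`, which matches the batching described above.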


Yes, sorry, I confused it with the GetRecords limit.


I suppose you could just ask them to raise the limits?



