Realtime Data Processing at Facebook (muratbuffalo.blogspot.com)
124 points by mad44 on July 9, 2016 | 7 comments


If you're looking to do something similar at your company: I work at Interana[1], and we're aiming to provide near-real-time insight into analytics, both slice-and-dice real-time queries and session-stitched funnels.

The founders were the guys who built Scuba, and we're taking a somewhat different approach (mostly driven by differences in scale). We're not quite at second-scale delivery times; we rely on more classical logfile rotation and aggregation mechanisms to get our raw data, and then an efficient sharding layer to get it into our columnstore.

[1] http://www.interana.com/


How many applications are written embedding or extending your tool? I understand the Scuba comparison and the very valid points you make, but the core of the paper seems to be about writing applications that can make real-time decisions: fraud detection, ad analytics, sensor alerts, etc.

AFAIK, unless I am mistaken, your tool cannot identify trending events as they are streamed in (using a moving standard deviation, for example) and feed them downstream to a pipeline.
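
To make the "moving standard deviation" idea concrete, here is a minimal sketch of flagging unusual values in a stream against a sliding window. It is not how Puma, Scuba, or Interana do it; the names `MovingStd` and `flag_spikes`, the window size, and the threshold are all assumptions chosen for illustration.

```python
import math
from collections import deque


class MovingStd:
    """Mean and sample standard deviation over the last `window` values."""

    def __init__(self, window):
        if window < 2:
            raise ValueError("window must be at least 2")
        self.window = window
        self.values = deque()
        self.total = 0.0      # running sum of values in the window
        self.total_sq = 0.0   # running sum of squared values in the window

    def push(self, x):
        self.values.append(x)
        self.total += x
        self.total_sq += x * x
        if len(self.values) > self.window:
            old = self.values.popleft()  # evict the oldest value
            self.total -= old
            self.total_sq -= old * old

    def mean(self):
        return self.total / len(self.values)

    def std(self):
        n = len(self.values)
        if n < 2:
            return 0.0
        var = (self.total_sq - self.total ** 2 / n) / (n - 1)
        return math.sqrt(max(var, 0.0))  # clamp tiny negative rounding error


def flag_spikes(stream, window=5, threshold=3.0):
    """Yield values more than `threshold` standard deviations from the
    mean of the preceding window (hypothetical alerting rule)."""
    stats = MovingStd(window)
    for x in stream:
        if len(stats.values) >= 2:
            sd = stats.std()
            if sd > 0 and abs(x - stats.mean()) > threshold * sd:
                yield x  # this is what would be fed downstream
        stats.push(x)
```

For example, `list(flag_spikes([10, 11, 10, 9, 10, 50, 10]))` flags only the `50`. The running sums keep each update O(1), at the cost of some numerical drift on very long streams.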



JSON happens to be the self-describing interchange format that's well known and generally accepted. It's the payload of most tracking cookies (which are the primary type of data we ingest). You don't need any transformation on the ingest tier: just POST, validate, dump to logfile.

We also support CSV and Apache logs, but JSON is what works for customers.
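
As a rough illustration of "POST, validate, dump to logfile", the sketch below takes a request body, checks that it parses as a JSON object, and appends it as one line to a log. This is only a guess at the general shape, not Interana's actual ingest code; the function name `ingest` and the "must be a JSON object" rule are assumptions.

```python
import io
import json


def ingest(body, log):
    """Validate one POSTed body and append it to `log` as a single JSON line.

    `body` is the raw request payload (bytes or str); `log` is any writable
    text file object. Returns True if the event was accepted.
    """
    try:
        event = json.loads(body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return False
    if not isinstance(event, dict):  # assumed rule: events must be objects
        return False
    # One compact JSON document per line, with no transformation of the data.
    log.write(json.dumps(event, separators=(",", ":")) + "\n")
    return True


if __name__ == "__main__":
    log = io.StringIO()  # stands in for the real logfile
    ingest(b'{"user": 1, "event": "click"}', log)
    ingest(b"not json", log)  # rejected, nothing written
    print(log.getvalue(), end="")
```

Writing one event per line keeps the logfile easy to rotate and to re-parse later, which fits the rotation-and-aggregation approach described above.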


Sort of unfortunate codenames, since Scribe is a company that makes message bus software in the Microsoft world, and Swift is, well, you know...


Scribe was open sourced in 2008... It's not a new project.

https://en.m.wikipedia.org/wiki/Scribe_(log_server)


And Puma is a Rails server.



