hosts -> rsyslog collector -> kafka -> custom crunching -> elastic
which is a bit inefficient, and we try to go hosts -> kafka directly when we can, but a lot of stuff only supports rsyslog, so the collector is there and it's simple enough.
The rsyslog collector and our crunchers are teeny tiny compared to the rest of the pipeline and can chew through up to a week of backlog in a few hours. The bottleneck for us is the network; if we upped the pipe to 10G we could probably get away with a single host.
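
For a rough idea of what a "crunching" stage between kafka and elastic might do, here is a minimal Python sketch. It is an illustration only, not the cruncher described above: the BSD-style syslog line format, the field names, the `crunch` and `to_bulk` helpers, and the `logs` index name are all assumptions, and the kafka consume side and the HTTP call to elastic are left out.

```python
import json
import re

# Assumption: messages arrive as classic BSD-style syslog lines, e.g.
# "<34>Oct 11 22:14:15 myhost su[123]: message text"
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^:\[\s]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def crunch(line):
    """Parse one syslog line into a dict, or return None if it does not match."""
    m = SYSLOG_RE.match(line)
    if m is None:
        return None
    pri = int(m.group("pri"))
    doc = {
        # The syslog priority value encodes facility * 8 + severity.
        "facility": pri // 8,
        "severity": pri % 8,
        "timestamp": m.group("timestamp"),
        "host": m.group("host"),
        "program": m.group("tag"),
        "message": m.group("message"),
    }
    if m.group("pid"):
        doc["pid"] = int(m.group("pid"))
    return doc


def to_bulk(docs, index="logs"):
    """Build a newline-delimited bulk body: one action line plus one source line per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"
```

In a real pipeline the lines would come from a kafka consumer and the bulk body would be POSTed to elastic's `_bulk` endpoint; batching many documents per request is what lets a small cruncher keep up with a backlog.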