A question for anyone working on data analytics projects: how do you handle local development and feature work when the database you touch is huge? For some time now I have been working with huge datasets in relational databases where keeping a local copy of the DB is simply not possible. My workaround is either a mix of a local editor and remote IPython, or mocked data. That of course causes some performance problems, since you can quite easily write slow code. Any ideas?
TL;DR: How do you develop code when the work requires a huge database (more than 40 GB)?
If the size of the data set is essential to your development, could you set up a development database server on a dedicated machine (an old workstation or laptop would do) on the local network, to act as common infrastructure for the whole team? This only works if your application supports connecting to a remote database, but it moves the database off individual machines and shares that resource across the development team, rather than having each of you run a big DB locally.
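For the application to use a shared server like this, the connection target has to be configurable rather than hard-coded. Here is a minimal sketch of one way to do that, assuming the connection is described by a URL in an environment variable. The variable name `DATABASE_URL`, the host `dev-db.local`, the user, and the database name are all made up for illustration and are not taken from your setup.

```python
import os
from urllib.parse import urlsplit

# Hypothetical default: the shared development server on the local network.
DEFAULT_DEV_DB_URL = "postgresql://dev_user@dev-db.local:5432/analytics"


def get_database_settings(env=None):
    """Return connection settings parsed from DATABASE_URL.

    Falls back to the (hypothetical) shared dev server when the
    variable is not set.
    """
    env = os.environ if env is None else env
    url = env.get("DATABASE_URL", DEFAULT_DEV_DB_URL)
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "user": parts.username,
        "database": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    # Each developer can override the target without touching the code,
    # e.g. by exporting DATABASE_URL before starting the application.
    print(get_database_settings())
```

The resulting dictionary can be passed to whatever database driver you use. The point is only that every developer's code resolves the same shared server by default and can be redirected with a single setting.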