I'm also wondering why this kind of intervention is necessary at all. The NoSQL solution we use at work has load-based automatic splitting, and I'd have thought (though I haven't confirmed) that this would be an obvious feature to include.
I would speculate that it's a poorly-chosen shard key. MongoDB's built-in sharding uses range-based partitioning of the shard key. If you choose user_id as your shard key, and those are autoincrementing integers, then you're screwed if newer users tend to be more active on average than older ones: all the new (hot) IDs fall in the same key range, so they all land on the same shard.
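To make that failure mode concrete, here's a toy sketch (chunk boundaries and names are invented, not MongoDB's actual internals) of range-based partitioning over autoincrementing IDs:

```python
import bisect

# Hypothetical chunk boundaries on user_id: shard 0 holds ids below
# 1_000_000, shard 1 holds 1_000_000..1_999_999, and so on.
chunk_bounds = [1_000_000, 2_000_000, 3_000_000]

def shard_for(user_id):
    """Range-based partitioning: find the chunk containing user_id."""
    return bisect.bisect_right(chunk_bounds, user_id)

# With autoincrementing ids, every newly signed-up (and typically most
# active) user falls into the highest range, i.e. the last shard:
new_users = [3_000_001, 3_000_002, 3_000_003]
print({uid: shard_for(uid) for uid in new_users})
# every new user maps to shard 3 -- a write hotspot
```

The old, quiet users are spread over the low ranges while all the fresh traffic piles onto one box.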
Where I work (Etsy) we keep an index server that maps each user to a shard on an individual basis. There are a number of advantages to it. For example, if one user generated a ton of activity they could in theory be moved to their own server. Approximately random works for the initial assignment.
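A minimal sketch of that lookup scheme (all names hypothetical; a real deployment would back the mapping with a replicated index database, not an in-memory dict):

```python
import random

SHARDS = ["db1", "db2", "db3", "db4"]

# The "index server": an explicit per-user mapping to a shard.
user_to_shard = {}

def assign(user_id):
    """Approximately random initial placement for a new user."""
    shard = random.choice(SHARDS)
    user_to_shard[user_id] = shard
    return shard

def lookup(user_id):
    return user_to_shard[user_id]

def migrate(user_id, new_shard):
    """A hot user can be moved (after copying their data over) by
    repointing one index entry -- no global rehash required."""
    user_to_shard[user_id] = new_shard

assign(42)
migrate(42, "db_dedicated")
print(lookup(42))  # "db_dedicated"
```

The win over any hash-based scheme is exactly that migrate step: placement is a policy decision, not a function of the key.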
Flickr works the same way (not by coincidence, since we have several former Flickr engineers on staff).
Sounds like a good system. I've noticed that people tend to do things like shard based on even/odd, and then they realize they need a third database and have to reshuffle everything.
I've never had either problem though... but if I ever need to shard I plan on doing it based on object ID. Then one request can be handled by multiple databases, "for free", increasing throughput and reducing response time.
Actually for anonymous sharding (without a central index) a consistent hash is about the closest you can get to ideal distribution and flexibility. I haven't looked but I presume that's what mongo uses under the hood for their auto-sharding, too.
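A toy consistent-hash ring (just a sketch; production implementations like Ketama use the same idea with tuned hash functions and many virtual nodes) shows the flexibility claim: adding a server remaps only the keys that fall to the new node, not nearly all of them as with modulo.

```python
import bisect
import hashlib

def h(key):
    """Map a string to a point on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, vnodes=100):
        # vnodes: virtual nodes per server, to smooth the distribution.
        self.points = sorted(
            (h(f"{n}:{i}"), n) for n in nodes for i in range(vnodes)
        )
        self.keys = [p for p, _ in self.points]

    def node_for(self, key):
        # First ring point clockwise from the key's hash (with wraparound).
        i = bisect.bisect(self.keys, h(key)) % len(self.keys)
        return self.points[i][1]

ring = Ring(["db1", "db2", "db3"])
bigger = Ring(["db1", "db2", "db3", "db4"])
keys = [f"user{i}" for i in range(1000)]
moved = sum(1 for k in keys if ring.node_for(k) != bigger.node_for(k))
print(moved)  # roughly a quarter of the keys move -- the ones db4 took over
```

Every key that moves is one the new node claimed; the other three shards keep what they had.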