That sounds pretty fantastic - when you say "ad-hoc" do you mean that it's fast enough to be directly queried from a UI - are we talking seconds or minutes for your queries?
What drawbacks have you found with Impala? I've been keeping an eye on it, and also Shark: http://shark.cs.berkeley.edu/
It depends, if you plan to scan our entire data set it could take 30-40 seconds (roughly ~2.8TB), but we have our data partitioned based on a key that makes sense for the kind of data you'd need to populate a web page and these queries are fast enough (< 2 seconds) for aggregations that come in via AJAX.
We haven't yet had a chance to optimize our environment either. For example, our nodes are still running a pretty old version of CentOS, so we have LLVM disabled (which would help a lot for huge batch computations...see http://blog.cloudera.com/blog/2013/02/inside-cloudera-impala...).
Also, our data is stored in RCFile, which is not exactly the most optimized columnar storage format. We're working on a plan to get everything over the new Parquet (http://parquet.io/) columnar format for another boost in performance.
We haven't come across any real drawbacks using Impala as of yet, it fits our needs pretty well.
Disclaimer: I work for Cloudera in their internal Tools Team, we like to dog food our stuff :).
Edit: One drawback of Impala is the lack of UDF support, but this is something that will be coming in a later release.
What drawbacks have you found with Impala? I've been keeping an eye on it, and also Shark: http://shark.cs.berkeley.edu/