Well, there are a couple of pieces of sample code on comp.arch (Google Groups / Usenet). Sadly, I can't find any examples of genAsm anywhere, only a couple of fragments of conAsm. Maybe that's better, as conAsm is what's actually going to run.
Higher-end Mill models have more and bigger things to encode, and so a given program will occupy more bytes than it will on a lower-end member. Thus a belt reference on a Gold is 6 bits, but only 3 on a Tin.
Mind you, the Mill's split-stream instruction decoding means each decoder only has to deal with ~168 bytes and ~16 instructions, and there's a separate cache for each of the two streams (I think?).
This thread, for example, talks about the instruction density of Mill programs. The Mill does somewhat poorly, though that's with a pre-alpha quality compiler doing the codegen: https://groups.google.com/forum/#!topic/comp.arch/RY3Bk7O61u...
Comparing several ISAs on the simple program in the first post, a Mill Gold CPU binary weighs in at 337 bytes (and 33 instructions), compared to:
* powerpc: 212 bytes, 53 instructions
* aarch64: 204 bytes, 51 instructions
* arm: 176 bytes, 44 instructions
* x86_64: 135 bytes, 49 instructions
* i386: 130 bytes, 55 instructions
* thumb2: 120 bytes, 51 instructions
* thumb: 112 bytes, 56 instructions
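To put those numbers side by side, here's a quick Python sketch that works out average bytes per instruction for each ISA. The figures are copied straight from the list above; the names `sizes`, `bytes_per_instruction` and `density` are just mine for illustration, nothing from the thread.

```python
# (bytes, instructions) for the same small program, as quoted above.
sizes = {
    "Mill Gold": (337, 33),
    "powerpc": (212, 53),
    "aarch64": (204, 51),
    "arm": (176, 44),
    "x86_64": (135, 49),
    "i386": (130, 55),
    "thumb2": (120, 51),
    "thumb": (112, 56),
}


def bytes_per_instruction(size_bytes, instructions):
    """Average encoded size of one instruction, in bytes."""
    return size_bytes / instructions


# Average instruction size per ISA.
density = {
    isa: bytes_per_instruction(b, n) for isa, (b, n) in sizes.items()
}

# Print largest binary first.
for isa, (b, n) in sorted(sizes.items(), key=lambda kv: kv[1][0], reverse=True):
    print(f"{isa:10s} {b:4d} bytes {n:3d} instr {density[isa]:5.2f} bytes/instr")
```

By that measure the Mill Gold binary comes out at roughly 10.2 bytes per instruction, against exactly 4 for the fixed-width 32-bit encodings (powerpc, aarch64, arm) and about 2 to 3 for x86 and thumb. So it has the fewest instructions of the lot, but each one is much bigger, which is presumably what you'd expect when one Mill instruction packs many operations.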