Well, there's a couple of pieces of sample code on comp.arch in google groups / ...

igodard · on Jan 12, 2017

GenAsm for the factorial function (conAsm in a different reply, above): define external function @fact w (in w) locals($3, %9, &0, ^6) {

label $0:

    %1 = sub(%0, 1) ^0;

    %2 = gtrsb(%1, 1) ^1;

    br(%2, $1, $2);

label $1 dominators($0) predecessors($0, $1): // loop=$1 header

    %3 = phi(w, %5, $1, %0, $0);

    %4 = phi(w, %6, $1, %1, $0);

    %5 = mul(%3, %4) ^2;

    %6 = sub(%4, 1) ^3;

    %7 = gtrsb(%6, 1) ^4;

    br(%7, $1, $2);

label $2 dominators($0) predecessors($0, $1):

    %8 = phi(w, %0, $0, %5, $1);

    retn(%8) ^5;

};

Higher-end Mill models have more and bigger things to encode, and so a given program will occupy more bytes than it will on a lower-end member. Thus a belt reference on a Gold is 6 bits, but only 3 on a Tin.

TazeTSchnitzel · on Jan 12, 2017

Mind you, the Mill's split-stream instruction decoding means each decoder only has to deal with ~168 bytes and ~16 instructions, and there's a cache for both (I think?).