Subtraction *is* comparison at the assembly level though. C sadly doesn't give a...

fubarred · on Dec 26, 2014

Having designed microcoded pipelined processors from gates on up in uni, that's true.

integer cmp x, y often reduces to microcode that effectively runs:

  - 3A: sub x, y, dummy_result (att syntax) <-- most likely^
  - 2A: sub copy_of_x, y (intel syntax)
  - 1A: push x, push y (or y, x), sub, update flags, pop

^ Depends heavily on the microarchitecture, because there's likely all sorts of extra features and ops that can be set in the same microinstruction cycle to implement the macroinstructions.

In signed 2's-complement (mostly everything), zero flag just checks that all result bits to be zero and the less than flag is just the result's MSB (negative result if 1).

Signed and unsigned have their under/overflow issues. Being able to detect that and handle it in code, rather than crashing or silently producing undefined results, can be important, such as safely re-callocing memory (multiply overflow). (There are other issues like pointer/slice index over/underflow too, and these are related problems.)

The issues are usually around whether branch-free code is absolutely necessary or whether predicting branches wrong would incur disastrous pipeline stall/s (greatly depends on the specific use-case, especially inner loops... don't prematurely overoptimize).

I almost feel there needs to be something in-between IR languages like LLVM IR and general-purpose C, that's not assembly but still general-purpose, functional/imperative, static analysis and able to expose math and branching state/options/differences cleanly across a variety platforms without a fugly/verbose syntax.

To make up a pseudoreligion (if it's doesn't look like your preferred Turing-complete language, sorry) at random:

     var x, y int32

     # ... assign x and y somehow here
     begin
       x = x +! y   # or x +=! y
     rescue IntegerOverflowError
       # ... int32.max + int32.max => (0xFFFFFFFF + 0xFFFFFFFF) & 0xFFFFFFFF => 0xFFFFFFFE (-2, int32)
     rescue IntegerUnderflowError
       # ... int32.min + int32.min => (0x7FFFFFFF + 0x7FFFFFFF) & 0xFFFFFFFF => 0xFFFFFFFE (-2, int32)
     end


     var s, t uint64

     # ... assign s and t somehow here
     begin
       s *=! t
     rescue IntegerOverflowError
       # ... uint64.max * uint64.max => ... => 1 (1, uint64)
     end

spc476 · on Dec 26, 2014

In signed 2's compliment architectures (primarily any microprocessor made in the mid 70s or later) that has flags (some RISC based CPUs, like the MIPS, don't have individual flags) will usually have an Overflow flag, (O or V), a Carry flag (C), a Zero flag (Z) and a Sign or Negative flag (S or N---here I'll be using N and V since I have a preference for Motorola CPUs). Yes, internal, a comparison is typically done via a subtraction; the flags are updated, but not the destination register. Now, depending upon how you want to interpret the results, as a signed comparison or unsigned comparison, depends on the flags you check (and the conditional jump (or branch) instructions will do this check for you). The flags that are checked for unsigned comparison are (!f means NOT f, i.e. the flag is not set):

    above or equal: !C
    above:          !C and !Z
    below or equal: C or Z
    below:          C

And the signed comparisons are:

    above or equal: N == V
    above:          N == V and !Z
    below or equal: N != V or Z
    below:          N != V

And, just for completeness, for both signed and unsigned:

    equal:   Z
    unequal: !Z

fubarred · on Dec 28, 2014

Cool. Brings back memories of implementing various jump instructions and carry prediction adders (do adds twice, pick result based on actual carry later) (I recall I had the smallest microprogram in the course class and it used the fewest microcycles by far... everyone hated me. :)

Would it be nice to be able to access the carry in and carry out as "variables", as well as the full imul/idiv if i.e. 32 args -> 64 bit full-precision results without dropping down into assembly. Perhaps it's unreasonable, but it seems there are few limited differences per generic instruction across processors that assuming they are all each "special unique snowflakes" is obviously untrue FUD.

Understanding architecture / assembly comes in handy to replace high-level branching code with branch-free code to avoid pipeline stalls when branch predictions (eg the branch prediction infrastructure) is guessing incorrectly. Also, being able to go down the stack is really helpful because there are most definitely bugs the way down.

userbinator · on Dec 26, 2014

Because that overflag makes all the difference (no pun intended), I wouldn't say that they are the same, but that comparison usually involves subtraction.

I remember coming across a set of lecture slides for a CS course on architecture that used its own toy architecture with neither flags nor separate comparison instructions - saying that comparison is the same as subtraction gives people the entirely wrong idea, and leads to the sort of bugs mentioned in this article.

vardump · on Dec 26, 2014

Yup. On pretty much any architecture I can think of, a CMP is exactly like a SUB that does not store result, but sets only flags.

haberman · on Dec 26, 2014

When something is easy at the hardware level but there is no easy way to express it in programs, it suggests a failure of the abstraction. Or more positively, room for improvement. I'd love to see some way that this can be expressed elegantly in C.

psyklic · on Dec 26, 2014

The way that C expresses it elegantly is via the comparison operator. The author is concerned that people try to be clever and use subtraction instead. Here is how the author corrected this error by using comparison operators instead of subtracting (in the tree man page):

    int
    intcmp(struct node *e1, struct node *e2)
    {
  -	  return (e1->i - e2->i);
  +	  return (e1->i < e2->i ? -1 : e1->i > e2->i);
    }

The ternary operator is necessary, because in C a relational operator is evaluated to 0 (false) or 1 (true). So, the author added a special case for returning a negative value (indicating the first argument should be sorted before the second).

caf · on Dec 26, 2014

An alternative without the ternary is

  return (e1->i > e2->i) - (e1->i < e2->i);

xmodem · on Dec 27, 2014

My C's a little rusty, but isn't the exact integer returned by the comparison undefined? Isn't is just defined as just zero or non-zero?

Therefore i would argue this is just as risky as the original solution, and that it's better to use the ternary operator

caf · on Dec 28, 2014

No, the relational operators (<, >, <= and >=) are defined to evaluate to either 0 or 1, and the result has type int.

The same is true of the equality operators (= and !=), the logical negation operator (!), and the logical boolean operators (&& and ||).

There are other C idioms based on this, like !!expr to squash a zero-or-non-zero expression down to zero-or-one.

tgma · on Dec 28, 2014

Regardless of the correctness of the code, if it's not obvious and you need to explain it, you probably should not write it (unless you have profile data that makes you do otherwise).

"Programs must be written for people to read, and only incidentally for machines to execute."

caf · on Dec 28, 2014

I think that particular example should be obvious to anyone whose C isn't "a little rusty".