Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

ARM (quite common these days) will sometimes do this as well, as the immediate value of an instruction has to fit inside of the fixed-width instruction, and the instruction is only itself 32-bits (some required for overhead). While it is slightly more likely you will see a PC-relative load, that requires an extra data fetch; so, instead, a "movt" (move top) instruction is used to set the upper 16-bits of a register after first setting the lower 16-bits. This requires just as much space as the load and separate data literal, and runs entirely out of the instruction cache.


What I saw an ARM compiler doing once is putting the 0xDEADBEEF into a data segment (in the 'text' area I think) then loading from there

Like it would load a string, but only 4 bytes, it can load in one step


Not certain if you are disagreeing, so to clarify: you can either do a load like that or two moves. The load requires one instruction word and one data word, so two words. The moves require two instruction words, so also two words. The load however, requires a data fetch. This fetch has to come from somewhere nearby, as the load instruction cannot offset very far, and so is going to be in the same segment as the code (called "text", for historic reasons: this has nothing to do with strings, which will be stored in the "data" segment, or sometimes even a "strings" segment), and most of the time will be directly after (or, for long functions, inside) of the code for that function. This is a trade off: moves are fast (cycle counts of instructions are not all the same, so "single step" doesn't mean much), have better guarantees that they will always have the same result (code changing has much heavier penalties than data changing), and might use up an asynchronous memory access that other parts of your code might be saturating (although I honestly am not certain if ARM does this like other architectures do). It thereby comes down to circumstance, configuration, and the whims of the people who wrote your compiler as to whether you will get mov+movt or a single ldr pc,.X+data.


I'm not disagreeing, just mentioning

"It thereby comes down to circumstance, configuration, and the whims of the people who wrote your compiler as to whether you will get mov+movt or a single ldr pc,.X+data."

Yes, this sums it nicely




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: