Omnimaga

General Discussion => Technology and Development => Computer Programming => Topic started by: harold on August 14, 2012, 01:46:39 pm

Title: [x86] Hiding the Else in a NOP
Post by: harold on August 14, 2012, 01:46:39 pm: I'm trying to implement 0x8000000000000000 >> nlz(x) well.

What I came up with might be a bit unorthodox:
Code:
mov r11d, 1
bsr rcx, rax
jz _iszero
shl r11, cl
.db 0F, 1F, 80 ; nop [rax+sdword] with the sdword being the next shl
_iszero:
shl r11, 63 ; 49 C1 E3 3F so 4 bytes

Because BSR is annoying and returns something useless when the argument is zero, I have to handle that case with a branch. But this gets rid of the branch I'd otherwise use to skip the second shl.
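To make the overlap concrete, here is a small C sketch (not part of the original listing) that lays out the seven bytes following `shl r11, cl` and shows what each path sees. It assumes `shl r11, 63` is assembled in its `C1 /4 ib` form, i.e. the bytes 49 C1 E3 3F. The names `tail`, `swallowed_disp32` and `check_layout` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in memory right after `shl r11, cl`. */
static const uint8_t tail[7] = {
    0x0F, 0x1F, 0x80,       /* nop dword [rax + disp32]: opcode 0F 1F, ModRM 80 */
    0x49, 0xC1, 0xE3, 0x3F  /* shl r11, 63: REX.W+B, C1 /4 (ModRM E3), imm8 = 63 */
};

/* On the fall-through path the nop consumes the last four bytes as its
   little-endian disp32, so the second shl is never executed. */
static uint32_t swallowed_disp32(void) {
    return (uint32_t)tail[3]
         | (uint32_t)tail[4] << 8
         | (uint32_t)tail[5] << 16
         | (uint32_t)tail[6] << 24;
}

/* On the jz path execution starts at offset 3 (_iszero) and decodes the
   same four bytes as a real shift by 63. */
static void check_layout(void) {
    assert(sizeof tail == 7);   /* 3-byte nop prefix + 4-byte shl */
    assert(tail[6] == 63);      /* shift count seen by the zero path */
    assert(swallowed_disp32() == 0x3FE3C149u);
}
```

Calling `check_layout()` passes silently. In other words, the nonzero path executes a single 7-byte `nop dword [rax+0x3FE3C149]`, and a nop with a memory operand does not actually access memory.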
Another way to do this is to shl by 63 in all cases (or use a 64-bit mov) and then shr back in the nonzero case. That means xor-ing the result of bsr with 63 though: not a disaster, but more instructions.
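As a sanity check on the two formulations, the following C sketch models both and asserts that they agree. It is only a reference model, not a proposed implementation: `bsr64` is a hypothetical portable stand-in for the BSR instruction, and the function names are invented here.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for BSR: index of the highest set bit. x must be nonzero. */
static unsigned bsr64(uint64_t x) {
    unsigned i = 0;
    while (x >>= 1) i++;
    return i;
}

/* Variant from the listing: 1 << bsr(x), or 1 << 63 when x is zero. */
static uint64_t top_bit_shl(uint64_t x) {
    return x ? (uint64_t)1 << bsr64(x) : (uint64_t)1 << 63;
}

/* Alternative: always start from 1 << 63, then shift right by
   bsr(x) ^ 63 (which equals nlz(x)) in the nonzero case. */
static uint64_t top_bit_shr(uint64_t x) {
    uint64_t r = (uint64_t)1 << 63;
    if (x) r >>= bsr64(x) ^ 63u;
    return r;
}

/* Both variants should isolate the same bit for every input. */
static void check_variants_agree(void) {
    const uint64_t samples[] = {0, 1, 2, 3, 0x80, 0xFFFF, UINT64_MAX};
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(top_bit_shl(samples[i]) == top_bit_shr(samples[i]));
}
```

The xor works because bsr(x) is in the range 0..63, so `bsr(x) ^ 63` is the same as `63 - bsr(x)`, which is the number of leading zeros.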

Is there any reason not to do it this way? (Besides "maintainability": I'm the only person who's ever going to read it anyway, and I certainly know what this means.)
Any unexpected slowdowns on some micro-architectures? Are trace caches OK with this?
Is the other way I described better?
