Omnimaga

Topic started by: harold on August 14, 2012, 01:46:39 pm

Title: [x86] Hiding the Else in a NOP
Post by: harold on August 14, 2012, 01:46:39 pm
I'm trying to implement 0x8000000000000000 >> nlz(x) efficiently, where nlz(x) is the number of leading zeros of x.

What I came up with might be a bit unorthodox:
Code:
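  mov r11d, 1    ; r11 = 1 (a 32-bit write zero-extends to 64 bits)
  bsr rcx, rax   ; rcx = index of the highest set bit; ZF = 1 if rax == 0
  jz _iszero
  shl r11, cl    ; nonzero case: r11 = 1 << bsr(x)
  .db 0F, 1F, 80 ; nop [rax+sdword] with the sdword being the next shl
_iszero:
  shl r11, 63    ; 49 C1 E3 3F so 4 bytes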
Because BSR leaves its destination undefined when the argument is zero (it only sets ZF), I have to handle that case with a branch. But this gets rid of the branch I'd otherwise use to skip the second shl.
Another way to do this is to shl by 63 in all cases (or use a 64-bit mov) and then shr back in the nonzero case. That means xor-ing the result of bsr with 63, though - not a disaster, but more instructions.
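To pin down what both variants are supposed to compute, here is a rough C model. It is only a sketch: the names nlz64, top_bit_ref and top_bit_alt are made up for illustration, and the x == 0 result (0x8000000000000000) is an assumption read off the _iszero path of the code above, matching x86's masking of the shift count to 6 bits.

```c
#include <stdint.h>

/* Hypothetical helper: number of leading zeros, 64 for x == 0. */
static unsigned nlz64(uint64_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 64;
    while (!(x & 0x8000000000000000ULL)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Reference: 0x8000000000000000 >> nlz(x). The count is masked to
   6 bits the way x86 shifts do, so x == 0 (count 64) shifts by 0.
   Masking also avoids C's undefined behaviour for a shift by 64. */
static uint64_t top_bit_ref(uint64_t x)
{
    return 0x8000000000000000ULL >> (nlz64(x) & 63);
}

/* Alternative: start from 1 << 63 and, only when x is nonzero,
   shift back right by bsr(x) ^ 63, which equals nlz(x). */
static uint64_t top_bit_alt(uint64_t x)
{
    uint64_t r = 0x8000000000000000ULL;
    if (x != 0) {
        unsigned idx = 63 - nlz64(x); /* the value BSR would produce */
        r >>= (idx ^ 63);
    }
    return r;
}
```

For nonzero x both functions isolate the highest set bit, i.e. 1 << bsr(x), which is what the shl r11, cl path produces. The xor works because idx ^ 63 == 63 - idx for any idx in 0..63.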

Is there any reason not to do it this way? (Besides "maintainability": I'm the only person who's ever going to read it anyway, and I certainly know what this means.)
Any unexpected slowdowns on some micro-architectures? Are trace caches OK with this?
Is the other way I described better?