Omnimaga
Calculator Community => TI Calculators => ASM => Topic started by: AssemblyBandit on July 06, 2013, 06:23:54 am
-
Here's a routine to divide AHL by 10. Could anyone spot any potential problems with it? In particular, is the inc l safe, or should I use inc hl instead?
DivAHLby10:
ld d,a
ld c,$0a
sub a
ld b,$18
DAHLLoop1:
add hl,hl
ld e,a
ld a,d
adc a,a
ld d,a
ld a,e
rla
cp c
jr c,DAHLLoop2
sub c
inc l
DAHLLoop2:
djnz DAHLLoop1
ld e,a
ld a,d
ret
-
Just wrote a program to brute force check the algorithm and it works perfectly as is. :)
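The checker itself was not posted, so the following is only a minimal Python sketch of what such a brute-force test could look like, not the program actually used. The names div_ahl_by_10 and check are hypothetical. The function models the routine above one register at a time, and the assert inside the loop covers the question about inc l.

```python
def div_ahl_by_10(a, hl):
    """Model of DivAHLby10; returns (A, HL, E) = (quotient high byte, quotient low word, remainder)."""
    d, rem = a, 0                 # ld d,a / sub a
    for _ in range(24):           # ld b,$18 ... djnz DAHLLoop1
        hl <<= 1                  # add hl,hl
        carry = hl >> 16
        hl &= 0xFFFF
        d = (d << 1) | carry      # ld a,d / adc a,a / ld d,a
        carry = d >> 8
        d &= 0xFF
        rem = (rem << 1) | carry  # ld a,e / rla
        if rem >= 10:             # cp c / jr c,DAHLLoop2
            rem -= 10             # sub c
            assert hl & 1 == 0    # add hl,hl always leaves bit 0 clear
            hl += 1               # inc l, so no carry into H is possible
    return d, hl, rem


def check(samples):
    """Compare the model against Python's divmod for every sample."""
    for n in samples:
        a, hl, rem = div_ahl_by_10(n >> 16, n & 0xFFFF)
        assert ((a << 16) | hl, rem) == divmod(n, 10), n
    return True


# A sampled sweep is quick; use range(1 << 24) for the full brute force.
print(check(list(range(0, 1 << 24, 4099)) + [9, 10, 0xFFFF, 0x10000, 0xFFFFFF]))
```

The full 24-bit sweep is slow in Python, which is why the example only samples the input range.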
-
Wow perfect, thanks! Brute force?! Can't argue with the results! How did you write it, on the calculator?
-
Because you are using add hl,hl, bit 0 of HL is always 0 by the time you get to the inc l, so it can never carry into H and you should be fine (as verified by jacobly :P). You might be able to get better speed by doing this, though:
DivAHLby10:
ld d,a
ld bc,$180a
sub a
DAHLLoop1:
add hl,hl
rl d
rla
cp c
jr c,DAHLLoop2
sub c
inc l
DAHLLoop2:
djnz DAHLLoop1
ld e,a
ld a,d
ret
E is the remainder, AHL is the quotient. It is 4 bytes smaller and 262 t-states faster :)
-
I should have known that it would always be zero, considering I basically just rotated it left :banghead: Thanks for optimizing it, I put it into Buttonz. Just so you know, I only added some stuff to my divHlby10 routine and spot-checked it at random; the registers I chose are arbitrary and can be changed if needed. I really do care about the remainder, though, since I need it to display decimal, but you probably already knew that.
-
Cool! If you take the input in EHL instead, then the output will be A as the remainder and EHL as the quotient:
DivEHLby10:
ld bc,$180a
sub a
DAHLLoop1:
add hl,hl
rl e
rla
cp c
jr c,DAHLLoop2
sub c
inc l
DAHLLoop2:
djnz DAHLLoop1
ret
That saves only 3 bytes and 12 cycles. If you want to squeeze a little more speed out of the routine without fully unrolling it, you can unroll the first 3 iterations, since a 3-bit value (at most 7) will never be >=10:
DivEHLby10:
ld bc,$150a
sub a
add hl,hl \ rl e \ rla
add hl,hl \ rl e \ rla
add hl,hl \ rl e \ rla
DAHLLoop1:
add hl,hl
rl e
rla
cp c
jr c,DAHLLoop2
sub c
inc l
DAHLLoop2:
djnz DAHLLoop1
ret
The cost is 12 bytes and you save only 87 cycles. I am trying to think of a better approach to get speed out of this.
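To see why skipping the compare is safe for exactly three iterations and no more, here is a small Python sketch. It is a simplified model that treats the 24-bit input as one integer rather than separate registers; the function name div24_by_10 and the skip parameter are hypothetical.

```python
def div24_by_10(n, skip=0):
    """Shift-and-subtract division of a 24-bit n by 10.

    The first `skip` iterations omit the compare/subtract step,
    mirroring the unrolled shift-only lines in the routine.
    """
    quot, rem = n & 0xFFFFFF, 0
    for i in range(24):
        quot <<= 1
        rem = (rem << 1) | (quot >> 24)   # bit shifted out of the top
        quot &= 0xFFFFFF
        if i >= skip and rem >= 10:       # cp c / jr c
            rem -= 10                     # sub c
            quot |= 1                     # inc l (bit 0 is clear here)
    return quot, rem


samples = list(range(0, 1 << 24, 4099)) + [0xFFFFFF]
# After 3 shifts the remainder is at most 7, so nothing is lost.
print(all(div24_by_10(n, skip=3) == divmod(n, 10) for n in samples))
# After 4 shifts it can be as large as 15, so skipping a 4th compare breaks it.
print(div24_by_10(0xF00000, skip=4) == divmod(0xF00000, 10))
```

In this model the first print shows True and the second shows False, which matches the reasoning that only the first three compares are redundant.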
EDIT: This routine takes a minimum of 966 t-states, an average of 984.5, and a maximum of 1002, making it almost 300 t-states faster at its slowest than the previous routine at its fastest. The downside is that it is 35 bytes, compared to the 15 it could be:
DivEHLby10:
;Inputs:
; EHL
;Outputs:
; EHL is the quotient
; A is the remainder
; D is not changed
; BC is 10
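ld bc,$050a     ; B = 5 loop iterations, C = 10
sub a           ; A = 0 (remainder)
sla e \ rla     ; the top 3 bits of E can never reach 10,
sla e \ rla     ; so these three shifts skip the compare
sla e \ rla
sla e \ rla     ; loop start: remaining 5 bits of E
cp c
jr c,$+4        ; skip sub c / inc e when A < 10
sub c
inc e
djnz $-8        ; back to the fourth sla e
ld b,16         ; now the 16 bits of HL
add hl,hl       ; loop start
rla
cp c
jr c,$+4        ; skip sub c / inc l when A < 10
sub c
inc l
djnz $-7        ; back to add hl,hl
ret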
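The routine above works because dividing E first leaves a remainder below 10 in A, and that remainder then feeds straight into the 16-bit division of HL. A rough Python model of that two-stage structure, for checking the idea rather than the exact timings, might look like this (the name div_ehl_by_10 is hypothetical):

```python
def div_ehl_by_10(e, hl):
    """Model of DivEHLby10; returns (E, HL, A) = (quotient high byte, quotient low word, remainder)."""
    a = 0                              # sub a
    for i in range(8):                 # 3 unrolled shifts + 5 looped ones
        e <<= 1                        # sla e
        a = (a << 1) | (e >> 8)        # rla
        e &= 0xFF
        if i >= 3 and a >= 10:         # cp c / jr c,$+4 (loop part only)
            a -= 10                    # sub c
            e |= 1                     # inc e (bit 0 is clear after sla e)
    for _ in range(16):                # ld b,16
        hl <<= 1                       # add hl,hl
        a = (a << 1) | (hl >> 16)      # rla
        hl &= 0xFFFF
        if a >= 10:                    # cp c / jr c,$+4
            a -= 10                    # sub c
            hl |= 1                    # inc l
    return e, hl, a


e, hl, a = div_ehl_by_10(0x01, 0xE240)   # 0x01E240 = 123456
print((e << 16) | hl, a)
```

Splitting the work this way means the E stage only touches 8-bit registers and the HL stage no longer has to shift a third register, which is where the speed gain comes from.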
-
Thanks a lot Xeda, now I know who to go to for code. That Pokemon Amber looks great!
-
"Thanks a lot Xeda, now I know who to go to for code."
I'm still not at the level of Runer112, jacobly, or calc84maniac (to name a few :P). I definitely enjoy doing this kind of coding, though!
"That Pokemon Amber looks great!"
Thanks! Now if it can ever get finished...