Omnimaga
Calculator Community => TI Calculators => ASM => Topic started by: Matrefeytontias on July 01, 2013, 12:37:58 pm
-
Hello people,
I need a HLbyDE signed division, but I can't find one on Google ...
Does any of you have that ?
-
Maybe you can have a look at the one used by Axe ? I believe there is an unsigned routine and a signed routine (one using "/" and the other one using "//").
-
Yeah it's the case, but they're embedded with other routines to work with Axe, so ...
-
The easiest way to do it, probably, is to just use a regular division routine and check for the sign of the input:
ld a,d
xor h ;now the sign flag is set if the result needs to be negative or positive.
push af
xor h ;now a=d again, but we have the sign flag set appropriately
jp p,$+9 ;if the sign is minus, negate DE
xor a
sub e
ld e,a
sbc a,a ;c flag is set, so this makes A=$FF
sub d
ld d,a
ld a,h
or a
jp p,$+9
xor a
sub l
ld l,a
sbc a,a ;c flag is set, so this makes A=$FF
sub h
ld h,a
call HL_Div_DE ;return the result in HL
pop af
ret p
xor a
sub l
ld l,a
sbc a,a
sub h
ld h,a
ret
In the worst case, it is 141 t-states + however long it takes for the regular division.
141 t-states extra if one input is negative
137 t-states extra if both are negative
89 t-states extra if both inputs are positive
-
Okay, thanks for this quick answer, I'll use that :)
-
Are you cooking up some more 3D stuff? :).
I only ask because sometimes it is faster to multiply by a reciprocal than do a division.
-
Meh, stop always guessing right ;D
I'll post about it later, don't worry.
-
EDIT: Fixed a problem for when HL=8000h.
I have this routine that is larger, but faster than my other division routines:
;===============================================================
HL_Div_BC_Signed:
;===============================================================
;Performs HL/BC
;Speed: 1350-55a-2b
; b is the number of set bits in the result
; a is the number of leading zeroes in the absolute value of HL, minus 1
; add 24 if HL is negative
; add 19 if BC is negative
; add 28 if the result is negative
;Size: 68 bytes
;Inputs:
; HL is the numerator
; BC is the denominator
;Outputs:
; DE is the quotient
; HL is the remainder
; BC is not changed
;Destroys:
; A
;===============================================================
ld a,h
xor b
push af
;absHL
xor b
jp p,$+9
xor a \ sub l \ ld l,a
sbc a,a \ sub h \ ld h,a
;absBC:
bit 7,b
jr z,$+8
xor a \ sub c \ ld c,a
sbc a,a \ sub b \ ld b,a
ld de,0
adc hl,hl
jr z,EndSDiv
ld a,16
add hl,hl
dec a
jp nc,$-2
ex de,hl
jp jumpin
Loop1:
add hl,bc ;--
Loop2:
dec a ;4
jr z,EndSDiv ;12|23
sla e ;--
rl d ;--
jumpin: ;
adc hl,hl ;15
sbc hl,bc ;15
jr c,Loop1 ;23-2b ;b is the number of bits in the absolute value of the result.
inc e ;--
jp Loop2 ;--
EndSDiv:
pop af \ ret p
xor a \ sub e \ ld e,a
sbc a,a \ sub d \ ld d,a
ret
So at its slowest, it is 1396 clock cycles and at its fastest (for non-trivial division) it is can get as low as 572 t-states for non-trivial division.
EDIT: I think I messed up the clock cycle count (but not by much). Anyways, here is a cleaner version that is smaller and more predictable. 1315-2b t-states where b is the number of bits set in the absolute value of the result:
EDIT2: The timings above are correct, now.
;===============================================================
HL_Div_BC_Signed:
;===============================================================
;Performs HL/BC
;Speed: 1315-2b cycles
; **same cycles as the regular HL_Div_BC
; add 24 if HL is negative
; add 24 if BC is negative
; add 28 if only one is negative
;Size: 60 bytes
;Inputs:
; HL is the numerator
; BC is the denominator
;Outputs:
; DE is the quotient
; HL is the remainder
; BC is not changed
; A is 0
; z flag is set
; c flag is reset
;===============================================================
ld a,h
xor b
push af
;absHL
xor b
jp p,$+9
xor a \ sub l \ ld l,a
sbc a,a \ sub h \ ld h,a
;absBC:
xor b
jp p,$+9
xor a \ sub c \ ld c,a
sbc a,a \ sub b \ ld b,a
add hl,hl
ld a,15
ld de,0
ex de,hl
jp jumpin
;89+15*80+26 = 1315-2b
Div_Loop_1:
add hl,bc ;--
Loop2:
dec a ;4
jr z,EndSDiv ;12|7
jumpin:
sla e ;8
rl d ;8
adc hl,hl ;15
sbc hl,bc ;15
jr c,Loop1 ;23-2b
inc e ;--
jp Loop2 ;--
EndSDiv:
pop af \ ret p
xor a \ sub e \ ld e,a
sbc a,a \ sub d \ ld d,a
ret
-
Impressive work Xeda :).
-
It seems that the second routine is wrong ... it gives me wrong results.
-
Wait, but does the first routine work?
Maybe I made a typo with the second one...
EDIT: Hmm, both seem to work for me, but there could be certain vales that cause problems. What values did you use?
For example, I did -16/3 and it returned -5 (65531) with a remainder of 1.