Author Topic: Sine Approximation [Z80]  (Read 2617 times)

0 Members and 1 Guest are viewing this topic.

Offline Xeda112358

  • they/them
  • Moderator
  • LV12 Extreme Poster (Next: 5000)
  • ************
  • Posts: 4680
  • Rating: +718/-6
  • Calc-u-lator, do doo doo do do do.
    • View Profile
Sine Approximation [Z80]
« on: January 25, 2014, 08:50:07 pm »
I made a neat little algorithm today that seems to do an okay job at approximating sine. My original intention was to create a routine for the following with a fixed point format where x is on [0,1)
x(1-x)

However, if 'x' is in 'a', i can just do a*(-a), but NEG is 2 bytes, 8 cycles, whereas CPL is 1 byte, 4 cycles. That made me wonder if I could make a fast routine to multiply A by its compliment (is it called 1s compliment?). After working in my notebook with some more abstract approaches (I used 7th degree polynomials) I reduced it to a fairly simple solution that would be much simpler to integrate into an IC or something. However, I decided to experiment a little with the result by seeing what happens if I don't let some bits to propagate into the upper 8 bits (remember, 8.8 fixed point). So basically, I computed the product of 7-th degree polynomials with binary coefficients, and then truncated any terms of degree 7 or lower, then I converted it back to binary and the result is a close approximation of x(1-x). This happens to be a close approximation of sine, too, so after some tweaking, I got it to have the same input/output range as Axe (I recall that Quigibo said he used x(1-x) as an approximation to sine, too).

The following uses 3 iterations as opposed to 8 and is faster than the Axe version, but larger by 8 (?) bytes:
Code: [Select]
p_Cos:
ld a,l
add a,64
ld l,a
p_Sin:
ld a,l
add a,a
push af
ld d,a
ld h,a
ld l,0
ld bc,037Fh
__SinLoop:
sla h
sbc a,a
xor d
and c
add a,l
ld l,a
rrc d
srl c
srl c
djnz __SinLoop
ld h,b
pop af
ret nc
xor a
sub l
ret z
ld l,a
dec h
ret
;This:
;  34 bytes
;  269 t-states min, else 282, else 294
;  avg. 76 t-states faster than Axe
;Axe:
;  27 bytes
;  341+b t-states min, else 354, else 366

If the bits of register 'l' are sabcdefg, this basically returns:
Code: [Select]
(0aaaaaaa^0bcdefg0)+(000bbbbb^000cdefg)+(00000ccc^00000def)

^ is for XOR logic, + is regular integer addition modulo 256
(s is the sign)

The original algorithm was going to be for computing the natural logarithm :P I came up with some new (to me) algorithms for that as well.

EDIT1: Thanks to calc84maniac for pointing out the optimisation using a different order of operations with xor/and ! Saved 1 byte, 12 cycles. This let me shuffle some code to save 8 more cycles.
EDIT2: Modified the last note to reflect the normalized input/output to match that of Axe.

Offline calc84maniac

  • eZ80 Guru
  • Coder Of Tomorrow
  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2898
  • Rating: +467/-17
    • View Profile
    • TI-Boy CE
Re: Sine Approximation [Z80]
« Reply #1 on: January 25, 2014, 09:17:35 pm »
I think I spy an optimization. You can do the AND C after the XOR instead of applying it to both inputs. That should save at least one byte and 12 cycles, or maybe more if that somehow leads to more improvements.
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

Offline Xeda112358

  • they/them
  • Moderator
  • LV12 Extreme Poster (Next: 5000)
  • ************
  • Posts: 4680
  • Rating: +718/-6
  • Calc-u-lator, do doo doo do do do.
    • View Profile
Re: Sine Approximation [Z80]
« Reply #2 on: January 25, 2014, 09:27:38 pm »
Thanks! Hopefully I can find more tomorrow.

This did let me remove the "ld d,a \ stuff \ ld a,d" and just put the "ld d,a" outside the loop and use "rrc d" instead of "ld a,d \ rrca". It saves 8 t-states, 0 bytes.