Calculator Community > ASM

Sine Approximation [Z80]

(1/1)

Xeda112358:
I made a neat little algorithm today that seems to do an okay job at approximating sine. My original intention was to create a routine for the following with a fixed point format where x is on [0,1)
x(1-x)

However, if 'x' is in 'a', i can just do a*(-a), but NEG is 2 bytes, 8 cycles, whereas CPL is 1 byte, 4 cycles. That made me wonder if I could make a fast routine to multiply A by its compliment (is it called 1s compliment?). After working in my notebook with some more abstract approaches (I used 7th degree polynomials) I reduced it to a fairly simple solution that would be much simpler to integrate into an IC or something. However, I decided to experiment a little with the result by seeing what happens if I don't let some bits to propagate into the upper 8 bits (remember, 8.8 fixed point). So basically, I computed the product of 7-th degree polynomials with binary coefficients, and then truncated any terms of degree 7 or lower, then I converted it back to binary and the result is a close approximation of x(1-x). This happens to be a close approximation of sine, too, so after some tweaking, I got it to have the same input/output range as Axe (I recall that Quigibo said he used x(1-x) as an approximation to sine, too).

The following uses 3 iterations as opposed to 8 and is faster than the Axe version, but larger by 8 (?) bytes:

--- Code: ---p_Cos:
ld a,l
add a,64
ld l,a
p_Sin:
ld a,l
add a,a
push af
ld d,a
ld h,a
ld l,0
ld bc,037Fh
__SinLoop:
sla h
sbc a,a
xor d
and c
add a,l
ld l,a
rrc d
srl c
srl c
djnz __SinLoop
ld h,b
pop af
ret nc
xor a
sub l
ret z
ld l,a
dec h
ret
;This:
;  34 bytes
;  269 t-states min, else 282, else 294
;  avg. 76 t-states faster than Axe
;Axe:
;  27 bytes
;  341+b t-states min, else 354, else 366

--- End code ---

If the bits of register 'l' are sabcdefg, this basically returns:

--- Code: ---(0aaaaaaa^0bcdefg0)+(000bbbbb^000cdefg)+(00000ccc^00000def)

^ is for XOR logic, + is regular integer addition modulo 256

--- End code ---
(s is the sign)

The original algorithm was going to be for computing the natural logarithm :P I came up with some new (to me) algorithms for that as well.

EDIT1: Thanks to calc84maniac for pointing out the optimisation using a different order of operations with xor/and ! Saved 1 byte, 12 cycles. This let me shuffle some code to save 8 more cycles.
EDIT2: Modified the last note to reflect the normalized input/output to match that of Axe.

calc84maniac:
I think I spy an optimization. You can do the AND C after the XOR instead of applying it to both inputs. That should save at least one byte and 12 cycles, or maybe more if that somehow leads to more improvements.

Xeda112358:
Thanks! Hopefully I can find more tomorrow.

This did let me remove the "ld d,a \ stuff \ ld a,d" and just put the "ld d,a" outside the loop and use "rrc d" instead of "ld a,d \ rrca". It saves 8 t-states, 0 bytes.

Xeda112358:
Necropost: this came up elsewhere and I made a python version if anybody is interested. I think it could be implemented really well with logic gates!
https://gist.github.com/Zeda/98aab98fd32a231b878772a435ffb306


--- Code: ---#!/usr/bin/python3

# This is ported from:
#    https://www.omnimaga.org/asm-language/sine-approximation-(z80)/
# (Xeda112358  a.k.a.  Zeda  a.k.a.  ZedaZ80)
#

from math import sin

def sineb(x):
    """Approximates 128*sin(x*2*pi/256)"""

    a1 = int(x)
    a0 = a1 << 1
    a2 = a1 >> 1

    # iteration 1
    if a1 & 64:
        l = ~a0 & 127
    else:
        l =  a0 & 127

    # iteration 2
    if a1 & 32:
        l += ~a1 & 31
    else:
        l +=  a1 & 31

    # iteration 3
    if a1 & 16:
        l += ~a2 & 7
    else:
        l +=  a2 & 7

    # check if it needs to be negated
    if a1 & 128:
        return -l
    else:
        return l

# Plot a graph of the approximation vs the actual
for x in range(0, 256, 2):
    y = sineb(x)
    z = int(sin(x*3.1415926535/128)*128)

    # translate and scale for "graphing"
    y += 128
    z += 128
    y >>= 1
    z >>= 1

    # "graph"
    #    X - Approximation
    #    O - Actual
    #    8 - used when the graphs overlap
    if y == z:
        print(" "*y + "8")
    elif y > z:
        print(" "*z + "X" + " "*(y-z-1) + "O")
    else:
        print(" "*y + "O" + " "*(z-y-1) + "X")

--- End code ---

Navigation

[0] Message Index

Go to full version