Calculator Community > ASM |
Sine Approximation [Z80] |
(1/1) |
Xeda112358:
I made a neat little algorithm today that seems to do an okay job at approximating sine. My original intention was to create a routine for the following with a fixed point format where x is on [0,1) x(1-x) However, if 'x' is in 'a', i can just do a*(-a), but NEG is 2 bytes, 8 cycles, whereas CPL is 1 byte, 4 cycles. That made me wonder if I could make a fast routine to multiply A by its compliment (is it called 1s compliment?). After working in my notebook with some more abstract approaches (I used 7th degree polynomials) I reduced it to a fairly simple solution that would be much simpler to integrate into an IC or something. However, I decided to experiment a little with the result by seeing what happens if I don't let some bits to propagate into the upper 8 bits (remember, 8.8 fixed point). So basically, I computed the product of 7-th degree polynomials with binary coefficients, and then truncated any terms of degree 7 or lower, then I converted it back to binary and the result is a close approximation of x(1-x). This happens to be a close approximation of sine, too, so after some tweaking, I got it to have the same input/output range as Axe (I recall that Quigibo said he used x(1-x) as an approximation to sine, too). The following uses 3 iterations as opposed to 8 and is faster than the Axe version, but larger by 8 (?) bytes: --- Code: ---p_Cos: ld a,l add a,64 ld l,a p_Sin: ld a,l add a,a push af ld d,a ld h,a ld l,0 ld bc,037Fh __SinLoop: sla h sbc a,a xor d and c add a,l ld l,a rrc d srl c srl c djnz __SinLoop ld h,b pop af ret nc xor a sub l ret z ld l,a dec h ret ;This: ; 34 bytes ; 269 t-states min, else 282, else 294 ; avg. 76 t-states faster than Axe ;Axe: ; 27 bytes ; 341+b t-states min, else 354, else 366 --- End code --- If the bits of register 'l' are sabcdefg, this basically returns: --- Code: ---(0aaaaaaa^0bcdefg0)+(000bbbbb^000cdefg)+(00000ccc^00000def) ^ is for XOR logic, + is regular integer addition modulo 256 --- End code --- (s is the sign) The original algorithm was going to be for computing the natural logarithm :P I came up with some new (to me) algorithms for that as well. EDIT1: Thanks to calc84maniac for pointing out the optimisation using a different order of operations with xor/and ! Saved 1 byte, 12 cycles. This let me shuffle some code to save 8 more cycles. EDIT2: Modified the last note to reflect the normalized input/output to match that of Axe. |
calc84maniac:
I think I spy an optimization. You can do the AND C after the XOR instead of applying it to both inputs. That should save at least one byte and 12 cycles, or maybe more if that somehow leads to more improvements. |
Xeda112358:
Thanks! Hopefully I can find more tomorrow. This did let me remove the "ld d,a \ stuff \ ld a,d" and just put the "ld d,a" outside the loop and use "rrc d" instead of "ld a,d \ rrca". It saves 8 t-states, 0 bytes. |
Xeda112358:
Necropost: this came up elsewhere and I made a python version if anybody is interested. I think it could be implemented really well with logic gates! https://gist.github.com/Zeda/98aab98fd32a231b878772a435ffb306 --- Code: ---#!/usr/bin/python3 # This is ported from: # https://www.omnimaga.org/asm-language/sine-approximation-(z80)/ # (Xeda112358 a.k.a. Zeda a.k.a. ZedaZ80) # from math import sin def sineb(x): """Approximates 128*sin(x*2*pi/256)""" a1 = int(x) a0 = a1 << 1 a2 = a1 >> 1 # iteration 1 if a1 & 64: l = ~a0 & 127 else: l = a0 & 127 # iteration 2 if a1 & 32: l += ~a1 & 31 else: l += a1 & 31 # iteration 3 if a1 & 16: l += ~a2 & 7 else: l += a2 & 7 # check if it needs to be negated if a1 & 128: return -l else: return l # Plot a graph of the approximation vs the actual for x in range(0, 256, 2): y = sineb(x) z = int(sin(x*3.1415926535/128)*128) # translate and scale for "graphing" y += 128 z += 128 y >>= 1 z >>= 1 # "graph" # X - Approximation # O - Actual # 8 - used when the graphs overlap if y == z: print(" "*y + "8") elif y > z: print(" "*z + "X" + " "*(y-z-1) + "O") else: print(" "*y + "O" + " "*(z-y-1) + "X") --- End code --- |
Navigation |
Message Index |