My way to compute Geometric Means (and square roots)

General Discussion > Math and Science

<< < (2/2)

aeTIos:
Aww I was planning to calculate the root of a super huge prime number :P

Xeda112358:
Necro-update.

I was going to go to sleep a couple hours ago, but as soon as I turned out the lights, for whatever reason, I thought of a way to tweak this algorithm to be even faster on the Z80. I've spent the last couple of hours refining my algorithm. I'll refrain from a full mathematical proof, because 1] I'm tired, and 2] I haven't made a formal proof yet :P

The principal is the same: We want some algorithm for making xn and yn converge to some "r". We also need to be able to relate xn and xn to the original x and y.

One further constraint is to use very, very simple operations (shift-and-add).

If xn<yn, and I define
xn+1=xn(1+2-kn)2
yn+1=yn(1-2-kn)2
then we have that xn+1yn+1=xnyn(1-4-kn)2

So if we keep track of the factors (1-4-kn)2, we can just divide them out of the final result. So basically, start with a constant factor of 1. Each iteration, multiply that by your sqrt((1-4-kn)2)=(1-4-kn). After all of the iterations are done, we have xn=yn and xy(c2)=xnyn=xn2. So then:
xy(c2)=xn2
xy=xn2/c2
sqrt(xy)=xn/c

The algorithm below appears to require n iterations to compute 4n bits of accuracy.

--- Code: ---;think of all values here as 2.m fixed point numbers.
INPUTS: X,Y are on [1,4)
c=1
iterate n times
a=y-x
if underflow
a=-a
(x,y) = (y,x) //swap x and y, so x<y
n=-2+intlg(a) //intlg(a) basically finds the negative of the index of the first '1' bit in the binary expansion of a. Basically, the "clz" instruction for some processors.
c-=c>>2n
x+=x>>n
x+=x>>n
y-=y>>n
y-=y>>n
return x/c

--- End code ---

c4ooo:

--- Quote from: aeTIos on March 31, 2014, 05:06:15 pm ---You're just a math goddess. I don't understand 50% of what you're saying here but whatever, might come in handy one time :D

--- End quote ---
^That *.*
And btw just interested, is that ti-basic code faster then just √(x*y)?

Xeda112358:
It almost certainly won't run faster (I haven't tried in a while). It's only useful in situations where excessively large numbers are used (the BASIC version).

However, if you are working in a 32-bit environment, there is no risk of overflowing into a 64-bit int via multiplication, so all operations can take place with 32 bits.

The algorithm is best suited for a machine that can perform arbitrary shifts (the Z80 can only shift one place to the left or right) and even better if it has a "clz" instruction. Then it averages roughly 5 shift-and-adds, 1 clz, and 1 subtraction for each 4 bits of accuracy.

The algorithm is best suited for fixed point math in the form of 2.n (so 2 integer bits, n fractional bits). This can be easily achieved for floats where the exponent can be set aside for later, and the mantissa is almost always already in a 1.n form. For integers, you just need to shift left an even number of times until the upper 2 bits are non-zero (so 01, 10,11).

Navigation

[0] Message Index

[*] Previous page

Go to full version