1
Math and Science / Re: A faster Newton's Method Square Root
« on: Yesterday at 10:11:37 pm »
I fixed the link, thanks 
For TI Floats, the first byte is actually used to indicate variable type (lower 5 bits), and for floats, TI uses the upper bit to indicate sign. The next byte is used for the exponent, and then 7 bytes for the 14 digits, for a total of 9 bytes.
TI also uses extra precision for some intermediate operations, so sometimes the floats are actually larger (which is why the OPx spots in RAM are 11 bytes instead of 9).TI uses BCD because numbers are entered in base 10, and they want that to be handled "exactly," especially for financial usage. (BCD has the same rounding issues as binary floats, it just happens at different places).
I'm personally not a fan of BCD, as it is wasteful and slow, and binary floats can do the job. The only advantage with BCD floats is that with financial math, you don't have to be careful with comparisons.
As for the speed of that 32-bit square root routine, remember that it is fully unrolled, and relies on a fully unrolled 16-bit square root for a grand total of about 270 bytes. Also note: the routine I linked to is an integer square root routine. The floating-point routine is probably closer to 2000cc as it needs an extra 8 bits of precision in the result and some special handling for the exponent and "special values." That said, yes, there are some advantages to the floating-point format, especially with division and square roots.
EDIT: Actually, the original link to sqrtHLIX doesn't even use this technique, so I updated it to point to sqrt32 which does use this technique, and expects the upper two bits of the input to be non-zero (which is guaranteed with these binary floats)

For TI Floats, the first byte is actually used to indicate variable type (lower 5 bits), and for floats, TI uses the upper bit to indicate sign. The next byte is used for the exponent, and then 7 bytes for the 14 digits, for a total of 9 bytes.
TI also uses extra precision for some intermediate operations, so sometimes the floats are actually larger (which is why the OPx spots in RAM are 11 bytes instead of 9).TI uses BCD because numbers are entered in base 10, and they want that to be handled "exactly," especially for financial usage. (BCD has the same rounding issues as binary floats, it just happens at different places).
I'm personally not a fan of BCD, as it is wasteful and slow, and binary floats can do the job. The only advantage with BCD floats is that with financial math, you don't have to be careful with comparisons.
As for the speed of that 32-bit square root routine, remember that it is fully unrolled, and relies on a fully unrolled 16-bit square root for a grand total of about 270 bytes. Also note: the routine I linked to is an integer square root routine. The floating-point routine is probably closer to 2000cc as it needs an extra 8 bits of precision in the result and some special handling for the exponent and "special values." That said, yes, there are some advantages to the floating-point format, especially with division and square roots.
EDIT: Actually, the original link to sqrtHLIX doesn't even use this technique, so I updated it to point to sqrt32 which does use this technique, and expects the upper two bits of the input to be non-zero (which is guaranteed with these binary floats)