Sorry, I'm on my phone, so I probably won't go too in-depth on this. Bug me for details if I don't get around to it and you need them.

So:
Given ##x\in[-0.5\ln 2,\,0.5\ln 2]##
Let ##y=x^{2}##
Let ##a=\frac{x}{2}\cdot\frac{1+\frac{5y}{156}\left(1+\frac{3y}{550}\left(1+\frac{y}{1512}\right)\right)}{1+\frac{3y}{26}\left(1+\frac{5y}{396}\left(1+\frac{y}{450}\right)\right)}##
Then ##e^{x}\approx\frac{1+a}{1-a}##

Accuracy is ~75.9666 bits.
7 increments, 1 decrement
6 constant multiplications
6 general multiplications
2 general divisions
1 div by 2 (right shift or decrement an exponent)
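For anyone who wants to poke at the formula, here is a minimal Python sketch of it. The function names are mine, not from the original implementation, and Python floats only carry 53 bits, so this can only confirm that the error sits at double-precision roundoff level, not the full ~76 bits.

```python
import math

def exp_reduced(x):
    """Approximate e**x for |x| <= 0.5*ln(2) using (1+a)/(1-a)."""
    y = x * x
    num = 1 + 5 * y / 156 * (1 + 3 * y / 550 * (1 + y / 1512))
    den = 1 + 3 * y / 26 * (1 + 5 * y / 396 * (1 + y / 450))
    a = x / 2 * num / den
    return (1 + a) / (1 - a)

def max_rel_error(samples=2001):
    """Largest relative error against math.exp over the reduced range."""
    half = 0.5 * math.log(2)
    worst = 0.0
    for i in range(samples):
        x = -half + 2 * half * i / (samples - 1)
        worst = max(worst, abs(exp_reduced(x) / math.exp(x) - 1))
    return worst
```

Calling max_rel_error() gives a quick sanity check over the whole interval. To see the extra bits you would need an extended-precision float type.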

For comparison, that's roughly equivalent to 16 terms of the Taylor series, or 8 terms of the standard Padé expansion (the exponential is special in that it comes out to f(x)/f(-x), so it can be done even more easily than most).

I basically carried out a Padé expansion for e^x to infinity and noticed that, after the constant term, all the even coefficients were zero, so I used a Padé expansion on that function to quickly find our approximation for a.

In my usage, I actually implemented a 2^{x} function since I'm using binary floats with 64-bit precision. I take int(x) and save that as the final exponent for the float, then remove that value from x. By definition of int(), x is now non-negative. If x≥.5, increment that saved exponent and subtract 1 from x. Now x is on [-.5,.5]. Now we need to compute 2^{x}, but that is equivalent to e^{x*ln(2)}. We've effectively applied range reduction to our original input, and now we can rely on the algorithm at the top. The integer part that we saved earlier now gets added to the exponent byte, et voilà!
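The range reduction above can be sketched in Python like this. It is only an illustration under a couple of assumptions: int() is taken to mean floor (that is what makes the remainder non-negative), math.ldexp stands in for adding to the float's exponent, and the names exp_reduced and pow2 are made up for this sketch.

```python
import math

def exp_reduced(x):
    """Approximate e**x for |x| <= 0.5*ln(2) using (1+a)/(1-a)."""
    y = x * x
    num = 1 + 5 * y / 156 * (1 + 3 * y / 550 * (1 + y / 1512))
    den = 1 + 3 * y / 26 * (1 + 5 * y / 396 * (1 + y / 450))
    a = x / 2 * num / den
    return (1 + a) / (1 - a)

def pow2(x):
    """Compute 2**x by splitting off the integer part first."""
    n = math.floor(x)          # integer part, saved for the exponent
    f = x - n                  # now 0 <= f < 1
    if f >= 0.5:
        n += 1                 # bump the saved exponent...
        f -= 1.0               # ...so that -0.5 <= f < 0.5
    # 2**f == e**(f*ln 2), and f*ln 2 is inside the reduced range
    return math.ldexp(exp_reduced(f * math.log(2)), n)
```

Here ldexp(m, n) returns m * 2**n, which plays the role of adding the saved integer to the exponent field.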

I think when I calculated max error using my floats it was accurate up to the last two bits or something. Don't quote me on that as I don't have those notes on me at the moment and it was done months ago.

Hey Art! It's been the same way for me, too. Adulthood and social media... ugh. I love programming and do program on my calc most days, but it's usually just a couple of minutes here and there throughout the day; I just don't have time for projects. I couldn't tell if I wasn't seeing you around or if it was because I don't check in regularly.

I'm also guessing division would be the subtraction instead of addition, truncating any remainder.

Division is typically performed exactly like 'schoolbook' long division (until you get to higher precision, then you can use some state-of-the-art algorithms and math magic).

Start with an accumulator and quotient set to 0.
Rotate the top bit of the numerator into the accumulator.
If accumulator>=denominator, then subtract the denominator from the accumulator and shift a 1 into the quotient; otherwise shift in a 0.
Repeat from the "rotate" step until every bit of the numerator has been used.

In code, since we are getting rid of one bit at a time in the numerator and adding 1 bit at a time to the quotient, we can actually recycle the freed up bits in the numerator. Here is an example of HL/C where C<128, A is the accumulator and HL doubles as the quotient and numerator:

HL_div_C:
;Input:
;  HL is the numerator
;  C is the denominator. C<128
;Output:
;  A is the remainder
;  HL is the quotient.
    xor a
    ld b,16
loop:
    add hl,hl   ;this works like shifting HL left by 1. Overflow (thus, the top bit) is left in the carry flag
    rla         ;shift A left, rotating in the carry flag as the low bit.
    cp c        ;compare A to C. Basically does A-C and returns the flags. If C>A, then there will be underflow, setting the carry flag.
    jr c,skip   ;skip the next two bytes if the carry flag is set (so A<C)
    inc l       ;we know the low bit of HL is 0, so incrementing HL will set that bit.
    sub c
skip:
    djnz loop
    ret
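If it helps to see the routine above without the flags, here is a rough Python model of it. The function name and the explicit masking are my own. It only mirrors what each instruction does to the registers, and it assumes 0 < C < 128 so that A never overflows 8 bits after the shift.

```python
def hl_div_c(hl, c):
    """Model of HL_div_C: returns (quotient, remainder) for 16-bit hl, 0 < c < 128."""
    a = 0                               # xor a
    for _ in range(16):                 # ld b,16 ... djnz loop
        carry = hl >> 15                # add hl,hl: the top bit goes to carry
        hl = (hl << 1) & 0xFFFF         # the freed low bit of HL is now 0
        a = ((a << 1) | carry) & 0xFF   # rla: carry becomes the low bit of A
        if a >= c:                      # cp c / jr c,skip
            hl |= 1                     # inc l: record a 1 in the quotient
            a -= c                      # sub c
    return hl, a
```

Note how the quotient bits fill in from the bottom of hl as the numerator bits leave from the top, which is the bit recycling described above.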

ld b,8 - This must be telling the loop to repeat 8 times to check each bit, so, how does the loop relate to b? I'm thinking about how nested loops would work...

djnz loop - kind of like "goto" loop, with the automatic decrementing of register b?

That is correct, and that's why we initially do ld b,8. Keep in mind that djnz * and jr * are intended for small(ish) loops or redirecting to code relatively close by. Measured from the start of the instruction, they can jump backwards up to 126 bytes and forwards up to 129. The offset byte itself is counted from the end of the instruction: an offset of -1 is simply a slower way of performing rst 38h and I do not suggest it, an offset of -2 (with jr *) creates an infinite loop, and an offset of 0 does nothing but waste clock cycles. Most assemblers will warn you of out-of-bounds jumps. For longer loops or jumping to code far away, use jp *. djnz * only works with register b and always decrements.

rrc e - you say this rotates register e, is this the same as changing the register input from left to right?

For example, if e=01111011, then rrc e would change it to e=10111101. The bottom bit gets returned in the c flag as well as being returned to the top bit of e. I used rrc * since rotating 8 times would leave it exactly how it was input. I could have used rr * or even sra * or srl *, but I prefer to avoid destroying registers if I can.
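A tiny Python model of that rotate, in case it is easier to see outside of assembly. The helper name rrc is just borrowed from the instruction, and it returns the carry alongside the new value since Python has no flags.

```python
def rrc(value):
    """Rotate an 8-bit value right by one; return (new_value, carry)."""
    carry = value & 1                       # the bottom bit goes to the carry flag...
    return (value >> 1) | (carry << 7), carry   # ...and also wraps around to the top bit

def rotate_eight_times(value):
    """Apply rrc eight times, as the multiplication loop does."""
    for _ in range(8):
        value, _carry = rrc(value)
    return value
```

rrc(0b01111011) returns (0b10111101, 1), matching the example, and rotate_eight_times hands back whatever it was given, which is why E survives the loop.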

But the multiplication isn't convoluted! It's exactly how most of us are taught in grade school, except instead of multiplying digits 0~9, it's just multiplying by 0 or 1 which is super trivial. Like:

       01110101
      x10101101
      ---------
       01110101
      000000000
     0111010100
    01110101000
   000000000000
  0111010100000
 00000000000000
011101010000000

Or removing the multiplies by zero:

       01110101
      x10101101
      ---------
       01110101
     0111010100
    01110101000
  0111010100000
011101010000000

So suppose bit 3 is set. Then you basically add your top number, shifted left three times. As an example, suppose you wanted to multiply C*E (ignoring the top 8 bits):

;C is our "top" number.
;E is our "bottom" number.
;A will be our "accumulator"

    ld a,0

    rrc e            ;this rotates register 'e' right, putting the bottom bit as "carry" [out of the register].
    jr nc,checkbit1  ;nc == not carry. If "carry" out was zero, skip this step.
    add a,c          ;if carry out was 1, then add to our accumulator.
checkbit1:
    sla c            ;finally, shift our "top number" to the left in case we need to add this to the accumulator, too. Then [REPEAT] 7 more times.
    rrc e
    jr nc,checkbit2
    add a,c
checkbit2:
    sla c
    rrc e
    jr nc,checkbit3
    add a,c
checkbit3:
    sla c
    rrc e
    jr nc,checkbit4
    add a,c
checkbit4:
    sla c
    rrc e
    jr nc,checkbit5
    add a,c
checkbit5:
    sla c
    rrc e
    jr nc,checkbit6
    add a,c
checkbit6:
    sla c
    rrc e
    jr nc,checkbit7
    add a,c
checkbit7:
    sla c
    rrc e
    jr nc,all_done
    add a,c
all_done:
    ret

If you can see how that relates to the schoolbook algorithm, then just know that the following does practically the same thing:

;C is our "top" number.
;E is our "bottom" number.
;A will be our "accumulator"

    xor a           ;mad hax to set A to zero. Faster, smaller.
    ld b,8
loop:
    rrc e           ;this rotates register 'e' right, putting the bottom bit as "carry" [out of the register].
    jr nc,no_add    ;nc == not carry. If "carry" out was zero, skip this step.
    add a,c         ;if carry out was 1, then add to our accumulator.
no_add:
    sla c           ;finally, shift our "top number" to the left in case we need to add this to the accumulator, too.
    djnz loop       ;aaand repeat, decrementing register B until zero (this is a specialized instruction on the Z80)
    ret
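Here is a rough Python model of that loop, mostly to show the register traffic. The function name and the 8-bit masking are mine; the comments name the instruction each line stands in for.

```python
def mul_c_e(c, e):
    """Model of the 8-bit loop: returns (A, E) after multiplying C by E."""
    a = 0                            # xor a
    for _ in range(8):               # ld b,8 ... djnz loop
        carry = e & 1                # rrc e: bottom bit goes to carry
        e = (e >> 1) | (carry << 7)  # ...and wraps around to the top
        if carry:                    # jr nc,no_add
            a = (a + c) & 0xFF       # add a,c (only the low 8 bits survive)
        c = (c << 1) & 0xFF          # sla c
    return a, e
```

The returned A is the low byte of the product, and E comes back unchanged because it was rotated a full 8 times.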

But please don't use that in a real program. It's terribly inefficient. If you understand how the register pairing works, you can come up with a much better AND more convoluted algorithm:

Spoiler For "Step-by-Step How to derive the 'best' 8-bit Multiplication Algorithm":

Let's start by rearranging the above code. The way we do 'schoolbook' multiplication starts at the least significant digit, but we can just as easily start from the most significant digit. So let's do an example in base 10:

  377
 x613
 ----
 =1*3*377+10*1*377+100*6*377
 =1(3*377+10(1*377+10(6*377)))
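As a quick check of that nested form, here is a small Python sketch that multiplies by walking the bottom number from its most significant digit down. The function name and the base parameter are only for illustration.

```python
def multiply_msd_first(top, bottom, base=10):
    """Multiply by processing the digits of 'bottom' from most to least significant."""
    digits = []
    while bottom:
        digits.append(bottom % base)   # collect digits, least significant first
        bottom //= base
    acc = 0
    for d in reversed(digits):         # most significant digit first
        acc = acc * base + d * top     # shift the accumulator up, then add digit*top
    return acc
```

multiply_msd_first(377, 613) goes through 6*377, then 10*(6*377)+1*377, then 10*(that)+3*377, which is exactly the last line of the example. With base=2 the "d * top" step becomes "add top or don't".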

If we want to convert that last line to pseudo-code:

H*E -> A
;H is the "bottom" number that we will now be checking from the top digit down to the bottom digit.
;E is the "top" number.
;A is the accumulator
;basic algo, after initializing A to zero.
;  multiply A by 2.
;  shift H left by 1
;  if this results in a bit carried out (so a 1 carried out), then add E ('top' number) to A (the accumulator)
;  repeat 7 more times for all bits in H.
    xor a
    ld b,8
loop:
    add a,a
    sla h           ;shifts H left by 1, bringing in a 0 for the low bit. mathematically the same as H*2 -> H
    jr nc,skip_add
    add a,e
skip_add:
    djnz loop
    ret

That's faster, but not optimal! To get the optimal way, let's stray from optimality a little to make 'L' our accumulator. Since we can't directly add another register to L, we'll have to juggle with register A, making it slower:

    ld l,0
    ld b,8
loop:
    sla l
    sla h
    jr nc,skip_add
    ld a,l \ add a,e \ ld l,a
skip_add:
    djnz loop
    ret

But any bit that 'sla l' shifts out would only land in the low bits of H, and those are never tested during the 8 iterations, so we can do 'sla l \ rl h', which is the same size and speed. However, this is the same as just doing "add hl,hl"! This is where you'll have to learn how register pairs work.

    ld l,0
    ld b,8
loop:
    add hl,hl
    jr nc,skip_add
    ld a,l \ add a,e \ ld l,a
skip_add:
    djnz loop
    ret

But wait, there is more! We don't have an "add l,e" instruction, but we do have an "add hl,de" instruction. If we make D==0, then we can change 'ld a,l \ add a,e \ ld l,a' to 'add hl,de'. The problem is that if the lower byte of HL (so L) overflows, the carry goes into the upper byte H, which holds our 'bottom' number. We can't have it changing our input value halfway through the algorithm! Thankfully, the changes never propagate into the bits of H that we have yet to check. This requires some tedious work to prove, but if you are cool with taking that at face value, then our last piece of the puzzle gives us:

    ld d,0
    ld l,d          ;since D=0 already, this sets L to 0, as we want. It's smaller and faster than ld l,0.
    ld b,8
loop:
    add hl,hl
    jr nc,skip_add
    add hl,de
skip_add:
    djnz loop
    ret

Even better, this actually gives us the full 16-bit result instead of just the lower 8 bits.
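If you would rather not take the "never propagates" claim at face value, a brute-force check is cheap. Below is a small Python model of the final routine (the name mul_h_e and the masking are mine) that can be compared against ordinary multiplication for every pair of 8-bit inputs.

```python
def mul_h_e(h, e):
    """Model of the final routine: HL = H*E with H as multiplier and DE = E."""
    hl = (h << 8) & 0xFFFF         # H holds the multiplier, L starts at 0 (ld l,d)
    de = e & 0xFF                  # ld d,0 makes DE equal to E
    for _ in range(8):             # ld b,8 ... djnz loop
        hl <<= 1                   # add hl,hl
        carry = hl >> 16           # bit shifted out of H is the next multiplier bit
        hl &= 0xFFFF
        if carry:                  # jr nc,skip_add
            hl = (hl + de) & 0xFFFF    # add hl,de
    return hl

def check_all():
    """True if the model matches h*e for all 65536 input pairs."""
    return all(mul_h_e(h, e) == h * e for h in range(256) for e in range(256))
```

The short reason it works: after k iterations the partial product is below 2^(8+k), so it still fits under the multiplier bits that have not been shifted out yet.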

Of course assembly would be way faster, but have you tried the standard built in Pxl-On(, Pxl-Off(, Pxl-Change(, Pt-On(, Pt-Off(, Pt-Change(, Line(, and Circle( commands? Also, if you include an extra argument for Circle( of {i it will perform faster (i being the imaginary i).

If you haven't tried Axe yet, I urge you to try that if you want much faster graphics. Just be warned that Axe works a lot closer to assembly, so you can't just press [ON] to break out of infinite loops. For example, if you do While 1:End, you will need to pull a battery and get a RAM clear.

Edit: Also, if you want a pixel-based circle (instead of TI's point-based one), I think KermM wrote an entirely TI-BASIC routine that performs faster than TI's.

This is.....odd. Which calculator? Did you have an item selected? Which one? Was it a bottle? What was inside the bottle?

TI-84+SE OS 2.55MP (shush, I use it for the extra built-in functions). I tried it at various points in the game, including with water selected. It also crashes when I go to use magic, then press down. And occasionally if I die and hit [2nd] too quickly, it glitches to an odd map area and when I move it causes it to crash.

I haven't played an awful lot, but it's fun so far! I have one major bug to report, though: Whenever I press the down arrow when I'm charged for an attack, I get an error (Err:Link I believe) and it causes my calc to freeze and I have to pull a battery.