Author Topic: The Optimization Compilation  (Read 90960 times)


Offline Quigibo

Re: The Optimization Compilation
« Reply #75 on: February 06, 2011, 09:09:55 pm »
It's not an auto-optimization, but it is a significant one.

A loop with its conditional at the end is 3 bytes smaller than a loop that checks its conditional at the beginning. For instance:
Code: [Select]
Repeat getKey(15)
...code...
End

Can become:
Code: [Select]
While 1
...code...
EndIf getKey(15)

You can't do this with every loop, since a lot of them need to check before entering the loop, but in situations like this example it probably doesn't matter. The EndIf can actually be used on any loop structure, though, including do loops, while loops, repeat loops, and for loops. A do loop, which is also new, is just a "While 1" or "Repeat 0", which automatically gets optimized to not do the check at all since it always loops.
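
To make the "check before entering" caveat concrete, here is a rough Python model of the two loop shapes. It is not Axe code: `stop` stands in for getKey(15), and all the helper names are made up for illustration.

```python
# Python model of the two loop shapes (not Axe code).
# `stop` stands in for getKey(15); every name here is hypothetical.

def pre_test(stop, body):
    """Repeat COND ... End: the condition is checked before every pass."""
    runs = 0
    while not stop():
        body()
        runs += 1
    return runs

def post_test(stop, body):
    """While 1 ... EndIf COND: the body runs first, then the condition is checked."""
    runs = 0
    while True:
        body()
        runs += 1
        if stop():
            break
    return runs

def always_pressed():
    """Pretend the key is already held when the loop is reached."""
    return True

def noop():
    pass

c = pre_test(always_pressed, noop)   # body never runs
d = post_test(always_pressed, noop)  # body runs once before the check
```

The only behavioural difference is that first pass: if the body must not run when the condition is already true, keep the check at the top.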
« Last Edit: February 06, 2011, 09:10:51 pm by Quigibo »
___Axe_Parser___
Today the calculator, tomorrow the world!

Offline ztrumpet

« Reply #76 on: February 06, 2011, 09:38:46 pm »
Cool!  I love post test loops! ;D

Offline squidgetx

« Reply #77 on: February 13, 2011, 04:29:44 pm »
Pixel Testing

Pixel testing can be a mean and nasty cycle stealer from many programs. But never fear, it can be optimized...a lot. Remember that we have access to the screen buffer in L6.

If you are pixel testing a constant pixel, like pxl-Test(20,20), you can cut the time this command takes by more than half with the following optimization:
Code: [Select]
{20*12+L6+2}ʳe4

This optimization relies on the fact that the numbers can basically be pre-computed. Use the following formula to derive the numbers you should use:
Code: [Select]
{Y*12+L6+(X/8)}e(X^8)

So for another example, the command pxl-Test(8,1) becomes {12+L6+1}e0.

The speed gain is so great that you still save cycles (although not as many) with a variable Y value. Treat the constant X value the same way as before and simply substitute your variable Y into the above code. So for example, pxl-Test(31,Y) becomes {Y*12+L6+3}e7.
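
The formula can be sanity-checked with a small Python model of the buffer arithmetic. This is only a sketch, not Axe code: `pxl_test_parts` and `pxl_test` are made-up names, and it assumes the usual 96x64 buffer with 12 bytes per row and bit 0 being the leftmost pixel of each byte, as the formula above implies.

```python
# Python model of the address arithmetic (not Axe code).
# Function names are hypothetical; the buffer is assumed to be 96x64, 12 bytes per row.

ROW_BYTES = 12  # 96 pixels per row / 8 pixels per byte

def pxl_test_parts(x, y):
    """Return (byte offset from the buffer start, bit index) for pixel (x, y).

    Bit 0 is treated as the leftmost pixel in the byte, matching X^8 in the formula.
    """
    return y * ROW_BYTES + x // 8, x % 8

def pxl_test(buffer, x, y):
    """Read one pixel from a 768-byte buffer using the precomputed parts."""
    offset, bit = pxl_test_parts(x, y)
    return (buffer[offset] >> (7 - bit)) & 1

# Turn on pixel (20, 20) by hand and read it back.
screen = bytearray(ROW_BYTES * 64)
screen[20 * ROW_BYTES + 2] |= 0x80 >> 4
```

Here `pxl_test_parts(20, 20)` gives offset 20*12+2 and bit 4, the same numbers as the first example, and `pxl_test_parts(31, 5)` gives offset 5*12+3 and bit 7.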

Edit: here are the numbers if anyone wants them
pxl-Test is 53 bytes, ~237 cycles, plus 66 to call. Then we have to load two constants, adding 20 cycles. Let's make a conservative estimate and say 320 cycles.
{} is 14 cycles. Loading a constant into it is 10 cycles. The fastest bit-check is e7 which is 20 cycles, while the slowest is e2 which is 49 cycles. So we're in the range of 44-73 cycles, less than a third of pxl-Test() O.o

For variable Y, pxl-Test(constant,Y) is about 326 cycles. Now, loading a var is 16 cycles, *12 is 52, and adding a constant is 21 cycles. Add the 14 cycles for {} and the 20-49 cycles for the bit check, and the total is 123-152 cycles, still more than twice as fast.
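
For anyone who wants to re-add the numbers, here is the tally as a Python sketch. The cycle counts are copied straight from this post, not re-measured, and the assumption that the variable-Y total includes the 14 cycles for {} is inferred from how the 123-152 range adds up.

```python
# Tally of the cycle counts quoted in the post (copied, not re-measured).
BRACES = 14                   # {} byte read
LOAD_CONST = 10               # loading a constant address
BIT_FAST, BIT_SLOW = 20, 49   # e7 (fastest) and e2 (slowest) bit checks

# Constant X and Y: {} + constant + bit check
const_xy = (BRACES + LOAD_CONST + BIT_FAST,
            BRACES + LOAD_CONST + BIT_SLOW)

# Variable Y: load var + *12 + add constant + {} + bit check
LOAD_VAR, TIMES_12, ADD_CONST = 16, 52, 21
var_y = tuple(LOAD_VAR + TIMES_12 + ADD_CONST + BRACES + bit
              for bit in (BIT_FAST, BIT_SLOW))

PXL_TEST_CONST = 320   # the post's rounded figure (237 + 66 + 20 = 323)
PXL_TEST_VAR_Y = 326

# Worst-case speedups, using the slowest bit check
speedup_const = PXL_TEST_CONST / const_xy[1]
speedup_var_y = PXL_TEST_VAR_Y / var_y[1]
```

Even in the worst case the constant version comes out more than four times faster and the variable-Y version a bit over twice as fast, which matches the claims above.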

I suspect that variable X and Y with this method is about the same speed as regular pxl-testing.
« Last Edit: February 15, 2011, 07:19:19 am by squidgetx »

Offline NinjaKnight

« Reply #78 on: February 13, 2011, 10:17:33 pm »
Is using Rect( to draw straight horizontal/vertical lines faster than Line( ?
Ninja vs. Chuck Norris == n/0. Both end with the world blowing up.

"We could use up two eternities in learning all that is to be learned about our own world and the thousands of nations that have arisen and flourished and vanished from it. Mathematics alone would occupy me eight million years."
-Mark Twain

Offline Runer112

« Reply #79 on: February 13, 2011, 11:59:10 pm »
Quote
If you are pixel testing a constant pixel, like pxl-Test(20,20), you can cut the time this command takes by more than half with the following optimization:
Code: [Select]
{20*12+L6+2}e4

Remember, storing or recalling two-byte values at a constant address is always smaller and faster than storing or recalling one-byte values!
Code: [Select]
{20*12+L₆+2}ʳe4

Quote
Is using Rect( to draw straight horizontal/vertical lines faster than Line( ?

Definitely. Here are the results I got for vertical lines. Also note the 3 bytes saved in the Rect() arguments with some nice hl "abuses." :P
  • Line(A,0,A,63): ~6132 cycles
  • Rect(A,0,+1,-2): ~3354 cycles

The results are even more pronounced for horizontal lines. Also, 1 byte saved in the Rect() arguments.
  • Line(0,A,95,A): 9809 cycles
  • Rect(0,A,-1,+2): 2579 cycles

But if you really want a blazingly fast horizontal line drawer in pure Axe, use the following. It has a check to make sure that the y-value is valid, and if so, the whole process of calling the routine, checking the y-value, and drawing the line takes only 497 cycles. You would call it with something like Ysub(HL). Also, note the trick employed to check that the y-value is less than 64. That saves a few bytes and cycles over a normal comparison by utilizing the fact that we want a value precisely in the 6-bit range. By simply resetting these 6 bits and leaving any other bits unchanged, any 6-bit values will become zero and any out of range values will become nonzero, resulting in easy checking.
Code: [Select]
Lbl HL
  →r₆
  ReturnIf and b11000000
  ᴇFFFF→{r₆*12+L₆}ʳ
  Fill(,10)
Return
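
A rough Python model of what the routine does, in case the masking trick is hard to picture. This is not Axe code: `hline` is a made-up name, y is treated as a one-byte value (0 to 255), and the "2 bytes from the store plus 10 from Fill" reading of the routine is an interpretation of the code above.

```python
# Python model of the HL routine (not Axe code); hline is a hypothetical name.
ROW_BYTES = 12
ROWS = 64

def hline(buffer, y):
    """Set row y of a 768-byte buffer to 0xFF; return False if y is rejected."""
    # "and b11000000": the result is nonzero as soon as bit 6 or bit 7 is set,
    # which for a one-byte y means y >= 64.
    if y & 0b11000000:
        return False
    start = y * ROW_BYTES
    # 2 bytes from the 0xFFFF store plus 10 more from Fill = one 12-byte row
    # (assumption about how the two commands combine).
    buffer[start:start + ROW_BYTES] = b"\xff" * ROW_BYTES
    return True

screen = bytearray(ROW_BYTES * ROWS)
drew_row_3 = hline(screen, 3)
drew_row_64 = hline(screen, 64)
```

The mask test replaces a comparison against 64: one AND decides whether the value fits in 6 bits.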
« Last Edit: February 14, 2011, 01:32:16 am by Runer112 »

Offline NinjaKnight

« Reply #80 on: February 14, 2011, 10:28:17 pm »
 O.O That's a saving of OVER 9000 CYCLES!

Offline squidgetx

« Reply #81 on: February 16, 2011, 07:19:08 am »
Here's something really cool I found out yesterday:
DispGraphr is faster than DispGraph. By more than 10000 cycles, too. It varies from calc to calc, but on mine it's a full 15000 cycles faster O.o This means that if you're making a monochrome game and you're not using the back buffer, then (at the cost of around 15 bytes or so) you can make a saving of OVER 9000 CYCLES!
« Last Edit: February 16, 2011, 07:19:27 am by squidgetx »

Offline Deep Toaster

« Reply #82 on: February 16, 2011, 09:26:08 am »
Wait ... what? How did that work O.o




Offline Happybobjr

« Reply #83 on: February 16, 2011, 10:01:25 am »
how? that's weird
School: East Central High School
 
Axe: 1.0.0
TI-84 +SE  ||| OS: 2.53 MP (patched) ||| Version: "M"
TI-Nspire    |||  Lent out, and never returned
____________________________________________________________

Offline Runer112

« Reply #84 on: February 16, 2011, 01:21:46 pm »
I see you've been reading up on my Commands documentation, eh squidgetx? Yeah, that's an interesting thing I discovered when speed testing the display commands. On calculators like mine with the old, "good" screen drivers, the screen driver delay seems to be pretty low and constant from calculator to calculator. DispGraph could run just as fast or faster than DispGraphr on these calculators. However, due to inconsistencies with the screen drivers in newer units, the routine may run too fast for the driver on some calculators, causing display problems, so Quigibo had to add a portion of code to pause the routine until the driver says it is ready. However, this pause itself adds some overhead time, making the routine slower.

Quigibo, the DispGraphr routine doesn't have any throttling system in place, yet no problems have been reported with it on newer calculators. Could you just remove the throttling system from the DispGraph routine and add one or two time-wasting instructions to make each loop iteration take as many cycles as each DispGraphr loop iteration?


EDIT: Hmm I don't know if Quigibo reads this thread and would see that, so I'm probably going to post that in a major thread he reads or send him a message about that.
« Last Edit: February 16, 2011, 01:26:24 pm by Runer112 »

Offline Deep Toaster

« Reply #85 on: February 16, 2011, 03:36:14 pm »
Weird, so basically DispGraphr has less of a delay than DispGraph, but it still works fine?

EDIT: Quigibo seems to read this occasionally (see the first post on this page). But posting in the Features Wishlist or something else would be a good idea, too, especially now that this thread is no longer in the Axe project subforum.
« Last Edit: February 16, 2011, 03:37:37 pm by Deep Thought »




Offline squidgetx

« Reply #86 on: February 22, 2011, 04:03:28 pm »
Quote
A quick and small way to determine the sign of a value (better than >>0) is
Code: [Select]
EXP//32768

It will return -1 if the value is negative, and 0 if the value is 0 or positive.
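
As a sketch of the result described here (not of how Axe actually compiles //32768, which this thread doesn't show), the same -1/0 flag can be modelled in Python on a 16-bit two's-complement value. `sign_flag` is a made-up name.

```python
# Python model of the described result (not Axe code); sign_flag is hypothetical.

def sign_flag(value16):
    """Return -1 if the 16-bit two's-complement value is negative, else 0."""
    value16 &= 0xFFFF                      # keep 16 bits
    signed = value16 - 0x10000 if value16 & 0x8000 else value16
    return signed >> 15                    # arithmetic shift: -1 or 0

flags = [sign_flag(v) for v in (0, 1, 32767, 0x8000, 0xFFFF)]
```

The flag depends only on the top bit, which is why nothing about the magnitude of the value matters.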

Also added how Text(30*256+20):Text("Text") is much smaller than Text(20,30,"Text")
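
Reading the example, the single-argument form seems to pack the Y coordinate into the high byte and the X coordinate into the low byte. A Python sketch of that packing (the helper names are made up, and the byte layout is inferred from 30*256+20 standing in for X=20, Y=30):

```python
# Python model of the packed coordinate (not Axe code); names are hypothetical.

def pack_pen(x, y):
    """Combine X (low byte) and Y (high byte) into one 16-bit value."""
    return y * 256 + x

def unpack_pen(word):
    """Split the 16-bit value back into (x, y)."""
    return word & 0xFF, word >> 8

packed = pack_pen(20, 30)
```

With constant coordinates the whole expression folds into one 16-bit constant, which is presumably where the size saving comes from.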

Credits to Runer112.
« Last Edit: February 22, 2011, 04:04:10 pm by squidgetx »

Offline Freyaday

« Reply #87 on: March 07, 2011, 12:43:26 am »
I'd like to point out that DispGraphr is not faster than DispGraph, it just uses fewer cycles. How can this be? DispGraphr requires a 6MHz clock, because the screen requires a ~10 microsecond delay between successive inputs; otherwise the screen displays random garbage. DispGraphr is faster than that at Full speed, necessitating the slowdown. DispGraph, however, uses TI's own method to prevent this, an alternate display routine with the necessary delay built in.
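
To put cycles and wall-clock time side by side, here is a small Python sketch. The 6 MHz figure and the ~10 microsecond delay come from this post; the 15 MHz value for Full speed is an assumed nominal figure, and both function names are made up.

```python
# Converting between cycles and time (Python sketch; names are hypothetical).
NORMAL_HZ = 6_000_000    # from the post
FULL_HZ = 15_000_000     # assumed nominal Full-speed clock

def cycles_to_us(cycles, clock_hz):
    """Microseconds taken by a given number of cycles at a given clock."""
    return cycles * 1_000_000 / clock_hz

def cycles_for_delay(delay_us, clock_hz):
    """Cycles that must elapse to cover a fixed delay at a given clock."""
    return delay_us * clock_hz / 1_000_000

lcd_delay_normal = cycles_for_delay(10, NORMAL_HZ)  # cycles per LCD write at 6 MHz
lcd_delay_full = cycles_for_delay(10, FULL_HZ)      # cycles per LCD write at 15 MHz
saving_us = cycles_to_us(15000, NORMAL_HZ)          # the 15000-cycle saving at 6 MHz
```

The same 10 microsecond gap costs 2.5 times as many cycles at the faster clock, so a routine tuned to just fit at 6 MHz runs too fast for the screen at Full speed.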
In other news, Frey continues kicking unprecedented levels of ass.
Proud member of LF#N--Lolis For #9678B6 Names


I'm a performer at heart; I stole it last week.
My Artwork!

Offline Runer112

« Reply #88 on: March 07, 2011, 12:54:32 am »
What do you mean DispGraphr is not faster? Using fewer cycles at the same clock speed pretty much defines it as being faster. Although the actual byte retrieval, calculations, and outputting to the screen take more cycles, it doesn't have the safety checks built-in that DispGraph does, which slow the DispGraph routine down to being slower than the DispGraphr routine.
« Last Edit: March 07, 2011, 01:00:43 am by Runer112 »

Offline Freyaday

« Reply #89 on: March 07, 2011, 01:42:40 am »
I'm trying to explain that the two commands are of roughly equal speed when the use of the Full and Normal commands is taken into account, and constantly switching between the two clock speeds so as not to slow down the rest of the program probably isn't what the compatibility mode was meant for. Add to that the time of having to store to both buffers, and the advantages of DispGraphr kinda get negated.
That said, if the program is intended for a 6MHz calc, then use r.
« Last Edit: March 07, 2011, 01:46:12 am by Freyaday »