Author Topic: Assembly Programmers - Help Axe Optimize!  (Read 31845 times)

Offline Quigibo

  • The Executioner
  • CoT Emeritus
  • LV11 Super Veteran (Next: 3000)
  • *
  • Posts: 2032
  • Rating: +1075/-24
  • I wish real life had a "Save" and "Load" button...
    • View Profile
Assembly Programmers - Help Axe Optimize!
« on: February 26, 2010, 03:52:51 pm »
I'm going to post most of the assembly routines I use in Axe Parser here to see if any of you asm programmers can help me with their optimizations.
  • I am always trying to optimize for size, not speed, unless the alternative is significantly faster and close to the same size.
  • I would greatly prefer that it not use any extra RAM for temporary variables, but that's not that big of a deal.
  • No self-modifying code or undocumented instructions, because I need this to stay compatible with Apps and with the Nspire's TI-84 emulator.

The most important thing right now is the clipped sprite routine, since it's really big.  Here's what I've got so far:

p_DrawOr8x8:
   push   hl
   pop   ix      ;Input hl = Sprite
   ld   b,7      ;Input c = Sprite X Position
   ld   d,0      ;Input e = Sprite Y Position
   ld   h,d
   ld   a,c
   add   a,b
   jr   c,__ClipLeft
   sub   96+7
   ret   nc
   cpl
   cp   b
   jr   nc,__NoClipH
__ClipRight:
   inc   d
   jr   __ClipHDone
__ClipLeft:
   add   a,89
   ld   c,a
__ClipHDone:
   inc   d      ;d,c,e are updated
__NoClipH:
   ld   a,e
   add   a,b
   jr   c,__ClipTop
   sub   64+7
   ret   nc
   cpl
   cp   b
   jr   nc,__NoClipV
   jr   __ClipBottom
__ClipTop:
   inc   ix
   inc   e
   jr   nz,__ClipTop
__ClipBottom:
   ld   b,a
__NoClipV:         ;b,ix,e are updated.
   dec   d
   jr   z,__NoFix
   inc   e
__NoFix:
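   push   de
   ld   l,e      ;hl = e (h is still 0 from above)
   ld   d,h      ;de = e
   add   hl,hl
   add   hl,de      ;hl = e*3
   add   hl,hl
   add   hl,hl      ;hl = e*12 (12 bytes per screen row)
   ld   e,c
   ld   a,e      ;a = c, kept for the and below
   srl   e
   srl   e
   srl   e      ;de = c/8 (byte column)
   add   hl,de
   ld   de,plotSScreen-11
   add   hl,de      ;hl = plotSScreen-11 + e*12 + c/8
   pop   de
   inc   b
   and   %00000111   ;a = c mod 8 (shift count), z if byte-aligned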
   jr   z,__DrawOr8x8Aligned
   ld   c,a
__DrawOr8x8Loop:
   push   bc
   ld   b,c
   ld   c,(ix+0)
   xor   a
__DrawOr8x8Shift:
   srl   c
   rra
   djnz   __DrawOr8x8Shift
   dec   d
   jr   z,__SkipRight
   or   (hl)
   ld   (hl),a
__SkipRight:
   dec   hl
   inc   d
   jr   z,__SkipLeft
   ld   a,c
   or   (hl)
   ld   (hl),a
__SkipLeft:
   ld   c,13
   add   hl,bc
   inc   ix
   pop   bc
   djnz   __DrawOr8x8Loop
   ret
__DrawOr8x8Aligned:
   dec   hl
   ld   de,12
__DrawOr8x8AlignedLoop:
   ld   a,(ix+0)
   or   (hl)
   ld   (hl),a
   inc   ix
   add   hl,de
   djnz   __DrawOr8x8AlignedLoop
   ret
__DrawOr8x8End:


If you spot anything that can be optimized, bold it so I can see what you changed, thanks!
___Axe_Parser___
Today the calculator, tomorrow the world!

Offline DJ Omnimaga

  • Now active at https://codewalr.us
  • CoT Emeritus
  • LV15 Omnimagician (Next: --)
  • *
  • Posts: 55811
  • Rating: +3147/-232
  • CodeWalrus founder & retired Omnimaga founder
    • View Profile
    • CodeWalrus
Re: Assembly Programmers - Help Axe Optimize!
« Reply #1 on: February 26, 2010, 04:57:59 pm »
I would prefer no extra RAM usage at all. Otherwise, games may not run on any TI-84+ manufactured after April 2007 and will not be compatible with the regular 83+, meaning a considerable drop in the author's audience.
« Last Edit: February 26, 2010, 04:58:20 pm by DJ Omnimaga »
In case you are wondering where I went, I am still regularly active in the TI community and am even making new TI-84 Plus CE games. However, I left Omnimaga back in March 2015 for various reasons explained here and there. I might come back one day, depending of certain circumstances, but my new TI community home (despite me being Omnimaga founder in 2001) is now CodeWalrus ( https://codewalr.us ). Sorry for the inconveniences.


Bandcamp|Reverbnation|Facebook|Youtube|Twitter
Retired Omnimaga admin (2001-11) and editor (2012-14)

Offline Galandros

  • LV9 Veteran (Next: 1337)
  • *********
  • Posts: 1140
  • Rating: +42/-10
    • View Profile
Re: Assembly Programmers - Help Axe Optimize!
« Reply #2 on: February 26, 2010, 05:09:49 pm »
Quote from: DJ Omnimaga on February 26, 2010, 04:57:59 pm
I would prefer no extra RAM usage at all. Otherwise, games may not run on any TI-84+ manufactured after April 2007 and will not be compatible with the regular 83+, meaning a considerable drop in the author's audience.

Generally, RAM usage just means the routine needs a few temporary bytes to store data (bytes in the program itself, in RAM the TI-OS makes available, or in the free RAM areas of the TI-OS). You only use the extra RAM pages when you need a good amount of memory.
Hobbing in calculator projects.

Offline Eeems

  • Mr. Dictator
  • Administrator
  • LV13 Extreme Addict (Next: 9001)
  • *************
  • Posts: 6017
  • Rating: +314/-36
  • C'est la vie
    • View Profile
    • Eeems
Re: Assembly Programmers - Help Axe Optimize!
« Reply #3 on: February 26, 2010, 06:14:01 pm »
Also, if I'm not wrong, there is still a little extra RAM in the newer calcs, but not that much.
/e

Offline DJ Omnimaga

Re: Assembly Programmers - Help Axe Optimize!
« Reply #4 on: February 26, 2010, 06:21:02 pm »
Yeah. I'm not sure if it's accessed the same way, though.

Offline Galandros

Re: Assembly Programmers - Help Axe Optimize!
« Reply #5 on: February 26, 2010, 07:11:58 pm »
In the new calcs only page 53h is still there. It uses the same port as before. Dunno what 3rd-party software uses it...

Offline calc84maniac

  • Epic z80 roflpwner
  • Coder Of Tomorrow
  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2883
  • Rating: +456/-17
    • View Profile
Re: Assembly Programmers - Help Axe Optimize!
« Reply #6 on: February 26, 2010, 08:43:34 pm »
Quote from: Galandros on February 26, 2010, 07:11:58 pm
In the new calcs only page 53h is still there. It uses the same port as before. Dunno what 3rd-party software uses it...

Actually, we don't know which one is still there. All we know is that pages $82-$87 appear to be the same 16K of physical memory.
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

Offline ztrumpet

  • The Rarely Active One
  • CoT Emeritus
  • LV13 Extreme Addict (Next: 9001)
  • *
  • Posts: 5714
  • Rating: +364/-4
  • If you see this, send me a PM. Just for fun.
    • View Profile
Re: Assembly Programmers - Help Axe Optimize!
« Reply #7 on: February 27, 2010, 10:28:07 am »
Quote from: Galandros on February 26, 2010, 07:11:58 pm
In the new calcs only page 53h is still there. It uses the same port as before. Dunno what 3rd-party software uses it...

If I'm not mistaken, Omnicalc's Quick Apps uses it, and it works fine on a new 84+SE.

Offline Quigibo

Re: Assembly Programmers - Help Axe Optimize!
« Reply #8 on: February 27, 2010, 09:23:58 pm »
Safe Copy requires the undocumented instruction:
  in f,(c)

Will that be compatible with the Nspire?  If anyone has one, could you please try adding this to any Axe Parser code and see if it crashes:

Asm(0E10ED70)

Offline calc84maniac

Re: Assembly Programmers - Help Axe Optimize!
« Reply #9 on: February 27, 2010, 10:21:03 pm »
Quote from: Quigibo on February 27, 2010, 09:23:58 pm
Safe Copy requires the undocumented instruction:
  in f,(c)

Will that be compatible with the Nspire?  If anyone has one, could you please try adding this to any Axe Parser code and see if it crashes:

Asm(0E10ED70)

It won't be compatible. But it doesn't require that instruction anyway. I always do this:
Code: [Select]
in a,($10)
rla
jr c,$-3 ;or was it "nc"?
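
For what it's worth, here is the same wait loop written with a label instead of $-3, as a sketch. It assumes bit 7 of port $10 reads as 1 while the LCD driver is busy, which is how the community port documentation describes it; if that holds, "c" is the right condition. The label name is made up for illustration.
Code: [Select]
__LcdWait:
in a,($10) ;read the LCD status port
rla ;bit 7 (assumed busy flag) -> carry
jr c,__LcdWait ;loop while the driver reports busy

Only documented instructions are involved, so nothing here depends on in f,(c).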
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

Offline DJ Omnimaga

Re: Assembly Programmers - Help Axe Optimize!
« Reply #10 on: February 27, 2010, 11:58:30 pm »
Hopefully, though, someone will write an 84+ emu for the Nspire to replace the current one, now that Ndless is out :P

Offline Quigibo

Re: Assembly Programmers - Help Axe Optimize!
« Reply #11 on: March 14, 2010, 03:23:09 am »
Does anyone know an efficient way to do signed multiplication in a 2's complement system?  I can't find any tutorials on the internet.  My naive method is to strip the sign from each operand while keeping track of it, multiply the positive versions together, and then apply the resulting sign to the product.  Is there a better method?

This is what I'm using to get the sign bit out of de and keep track of it with the 'b' register.  b starts at zero, so I can use bit 0 of b as the new sign bit if I repeat this for hl.

Code: [Select]
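bit 7,d
jr z,__MulSNotNeg ;skip if de is non-negative
inc b ;count the negative input
xor a
sub e
ld e,a ;e = -e, carry set if e was nonzero
sbc a,a ;a = 0 or $FF (the borrow)
sub d
ld d,a ;de = -de
__MulSNotNeg: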

I would have to do the same thing for hl and then again at the end when I need to put the bit back so it seems like a lot of extra code...
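
Spelled out, the naive approach would look roughly like the sketch below. __MulUnsigned is a placeholder for whatever unsigned routine is used (assumed here to take hl and de and return the product in hl), and the other label names are made up as well.
Code: [Select]
ld b,0 ;b counts the negative inputs
bit 7,d
jr z,__MulSDePos
inc b
xor a ;de = -de
sub e
ld e,a
sbc a,a
sub d
ld d,a
__MulSDePos:
bit 7,h
jr z,__MulSHlPos
inc b
xor a ;hl = -hl
sub l
ld l,a
sbc a,a
sub h
ld h,a
__MulSHlPos:
push bc ;the multiply may destroy b
call __MulUnsigned ;hypothetical: hl = hl * de
pop bc
bit 0,b ;odd count = exactly one negative input
ret z
xor a ;negate the product
sub l
ld l,a
sbc a,a
sub h
ld h,a
ret

That is three copies of the 6-byte negation plus the bookkeeping around them.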
« Last Edit: March 14, 2010, 03:26:25 am by Quigibo »

Offline calc84maniac

Re: Assembly Programmers - Help Axe Optimize!
« Reply #12 on: March 14, 2010, 09:38:51 am »
No need at all. 16-bit * 16-bit -> 16-bit will give the same result for unsigned and signed arithmetic. Problem solved! ;)

Seriously, try multiplying some signed values in your parser and you will get the right results.

Edit:
Scratch that, I just tried it myself. What is your normal multiplication routine?

Edit2:
I just disassembled it, and I think you are only doing an 8-bit * 16-bit multiplication. That could explain the bad outputs.
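
For comparison, a full 16-bit * 16-bit -> 16-bit routine might look like the sketch below (this is not Axe's actual routine; the register assignment and label names are just assumptions for the example). Because only the low 16 bits are kept, it gives the same answer for signed and unsigned inputs: $FFFE * 3 = $2FFFA, and the low word $FFFA is -6, which is -2 * 3.
Code: [Select]
;hl = bc * de (low 16 bits), destroys a and bc
 ld hl,0
 ld a,16 ;one pass per bit of bc
__Mul16Loop:
 add hl,hl ;shift the running result left
 sla c
 rl b ;top bit of bc -> carry
 jr nc,__Mul16Skip
 add hl,de ;add de when that bit was set
__Mul16Skip:
 dec a
 jr nz,__Mul16Loop
 ret
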
« Last Edit: March 14, 2010, 09:49:09 am by calc84maniac »
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

Offline Galandros

Re: Assembly Programmers - Help Axe Optimize!
« Reply #13 on: March 14, 2010, 10:17:13 am »
I have some ideas, but I don't know how well they would work in practice:
There are shift instructions that preserve the sign bit, for example sra, and sla is the other arithmetic shift... Probably this isn't as useful or as fast as other methods.
When the multiplication is finished, you can set the correct sign based on the inputs.
You can probably use the sign flag to optimize.

neg is equivalent to cpl / inc a. Dunno how this carries over to 16-bit register pairs.
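
As a sketch of how that could carry over to a register pair (hl is chosen arbitrarily here): complement both bytes through a, then add one to the whole pair.
Code: [Select]
ld a,h
cpl
ld h,a
ld a,l
cpl
ld l,a ;hl = not hl
inc hl ;hl = -hl

This comes to 7 bytes, one more than the xor a / sub l / ld l,a / sbc a,a / sub h / ld h,a sequence, so it is probably not a win for size.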

Offline calc84maniac

Re: Assembly Programmers - Help Axe Optimize!
« Reply #14 on: March 14, 2010, 12:55:18 pm »
Here is your original multiplication routine:
Code: [Select]
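 xor a
 or d ;is d zero?
 jr nz,$+3
 ex de,hl ;if so, swap so the 8-bit operand ends up in l
 ld a,l ;a = 8-bit multiplier
 ld hl,0
_multloop:
 rra ;next multiplier bit -> carry
 jr nc,$+3
 add hl,de
 sla e
 rl d ;de = de*2
 or a ;done when no bits are left, also resets carry
 jr nz,_multloop
 ret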

Here is my modified signed version, only 8 extra bytes (all in the overhead). It multiplies two signed values, one of which is between -256 and 255.
Code: [Select]
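 ld a,d
 rrca
 cp d ;z only if d is $00 or $FF (de is within -256..255)
 jr nz,$+3
 ex de,hl ;if so, swap so the small operand ends up in hl
 xor a
 inc h ;was h $FF (small operand negative)?
 jr nz,$+3
 sub e ;yes: a = -e
 ld h,a ;with l cleared below, hl starts at -256*de (or 0)
 ld a,l ;a = low byte of the small operand
 ld l,0
 or a ;Returns if multiplying by 0 or -256, also resets carry flag
 ret z
_multloop:
 rra ;next multiplier bit -> carry
 jr nc,$+3
 add hl,de
 sla e
 rl d ;de = de*2
 or a ;done when no bits are left, also resets carry
 jr nz,_multloop
 ret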
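
A quick way to sanity-check it, assuming the routine above sits under a label (called __MulSigned here purely for illustration):
Code: [Select]
ld hl,$FFFD ;-3
ld de,300
call __MulSigned ;hypothetical label for the routine above
;hl should now be $FC7C, which is -900

Tracing it by hand: d is neither $00 nor $FF, so the operands are not swapped; h was $FF, so h becomes -e = $D4, which accounts for the -256 * de term; the loop then adds de * $FD on top, and $D400 + 300 * 253 wraps around to $FC7C.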
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman