Omnimaga: The Coders Of Tomorrow
22 May, 2013, 18:59:48 *
Topic: Assembly Programmers - Help Axe Optimize! (Read 20498 times)
Quigibo
« on: 26 February, 2010, 22:52:51 »

I'm going to post most of the assembly routines I use in Axe Parser here to see if any of you asm programmers can help me optimize them.
  • I am always trying to optimize for size, not speed, unless the routine is significantly faster and close to the same size.
  • I would greatly prefer that it does not use any extra RAM to store temporary variables, but that's not that big of a deal.
  • No self-modifying code or undocumented instructions, because I need to make this compatible with Apps and with the Nspire's TI-84 emulator.

The most important thing right now is the clipped sprite routine, since it's really big.  Here's what I've got so far:

p_DrawOr8x8:
   push   hl
   pop   ix      ;Input hl = Sprite
   ld   b,7      ;Input c = Sprite X Position
   ld   d,0      ;Input e = Sprite Y Position
   ld   h,d
   ld   a,c
   add   a,b
   jr   c,__ClipLeft
   sub   96+7
   ret   nc
   cpl
   cp   b
   jr   nc,__NoClipH
__ClipRight:
   inc   d
   jr   __ClipHDone
__ClipLeft:
   add   a,89
   ld   c,a
__ClipHDone:
   inc   d      ;d,c,e are updated
__NoClipH:
   ld   a,e
   add   a,b
   jr   c,__ClipTop
   sub   64+7
   ret   nc
   cpl
   cp   b
   jr   nc,__NoClipV
   jr   __ClipBottom
__ClipTop:
   inc   ix
   inc   e
   jr   nz,__ClipTop
__ClipBottom:
   ld   b,a
__NoClipV:         ;b,ix,e are updated.
   dec   d
   jr   z,__NoFix
   inc   e
__NoFix:
   push   de
   ld   l,e
   ld   d,h
   add   hl,hl
   add   hl,de
   add   hl,hl
   add   hl,hl
   ld   e,c
   ld   a,e
   srl   e
   srl   e
   srl   e
   add   hl,de
   ld   de,plotSScreen-11
   add   hl,de
   pop   de
   inc   b
   and   %00000111
   jr   z,__DrawOr8x8Aligned
   ld   c,a
__DrawOr8x8Loop:
   push   bc
   ld   b,c
   ld   c,(ix+0)
   xor   a
__DrawOr8x8Shift:
   srl   c
   rra
   djnz   __DrawOr8x8Shift
   dec   d
   jr   z,__SkipRight
   or   (hl)
   ld   (hl),a
__SkipRight:
   dec   hl
   inc   d
   jr   z,__SkipLeft
   ld   a,c
   or   (hl)
   ld   (hl),a
__SkipLeft:
   ld   c,13
   add   hl,bc
   inc   ix
   pop   bc
   djnz   __DrawOr8x8Loop
   ret
__DrawOr8x8Aligned:
   dec   hl
   ld   de,12
__DrawOr8x8AlignedLoop:
   ld   a,(ix+0)
   or   (hl)
   ld   (hl),a
   inc   ix
   add   hl,de
   djnz   __DrawOr8x8AlignedLoop
   ret
__DrawOr8x8End:
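
For anyone checking an optimized version against expected output, here is a small reference model in Python. It is not Z80 and not a translation of the routine above. It assumes the usual 96x64, 1-bit-per-pixel buffer with 12 bytes per row (consistent with the 96 and 64 constants and the multiply-by-12 in the listing) and a sprite stored as 8 row bytes with bit 7 as the leftmost pixel. The names draw_or_8x8, WIDTH_BYTES and HEIGHT are made up for this sketch.

```python
WIDTH_BYTES = 12   # 96 pixels / 8 pixels per byte (assumed buffer layout)
HEIGHT = 64


def draw_or_8x8(buf, sprite, x, y):
    """OR an 8x8 sprite into the buffer, clipping at all four edges.

    x and y may be negative or past the right/bottom edge.
    """
    # Floor division also gives the right byte column and shift for negative x.
    col, shift = divmod(x, 8)
    for row, bits in enumerate(sprite):
        sy = y + row
        if not 0 <= sy < HEIGHT:
            continue  # this row is clipped vertically
        # Spread the row over a 16-bit window: high byte lands in column
        # `col`, low byte in column `col + 1`.
        word = (bits << 8) >> shift
        left, right = word >> 8, word & 0xFF
        if 0 <= col < WIDTH_BYTES:
            buf[sy * WIDTH_BYTES + col] |= left
        if 0 <= col + 1 < WIDTH_BYTES:
            buf[sy * WIDTH_BYTES + col + 1] |= right
```

Drawing the same sprite at the same coordinates with the asm routine and with this model, then comparing the two buffers byte for byte, is one way to catch clipping regressions after an optimization.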


If you spot anything that can be optimized, bold it so I can see what you changed, thanks!
DJ Omnimaga
« Reply #1 on: 26 February, 2010, 23:57:59 »

I would prefer no extra RAM usage at all. Otherwise, games may not run on any TI-84+ manufactured after April 2007 and will not be compatible with the regular 83+, meaning a considerable drop in the author's audience.
« Last Edit: 26 February, 2010, 23:58:20 by DJ Omnimaga »
Galandros
« Reply #2 on: 27 February, 2010, 00:09:49 »

Quote from DJ Omnimaga: "I would prefer no extra RAM usage at all. Otherwise, games may not run on any TI-84+ manufactured after April 2007 and will not be compatible with the regular 83+, meaning a considerable drop in the author's audience."
Generally, RAM usage just means the routine needs a few temporary bytes to store data (bytes in the program itself, in the RAM that TI-OS makes available, or in free RAM areas of TI-OS). You only use the extra RAM pages when you need a large amount of memory.
Eeems
« Reply #3 on: 27 February, 2010, 01:14:01 »

Also, if I'm not wrong, there is a little extra RAM in the newer calcs, but not that much.

DJ Omnimaga
« Reply #4 on: 27 February, 2010, 01:21:02 »

Yeah. I'm not sure if it's accessed the same way, though.
Galandros
« Reply #5 on: 27 February, 2010, 02:11:58 »

In the new calcs, only page 53h is still there. It uses the same port as before. Dunno what third-party software uses it...
calc84maniac
« Reply #6 on: 27 February, 2010, 03:43:34 »

Quote from Galandros: "In the new calcs, only page 53h is still there. It uses the same port as before. Dunno what third-party software uses it..."
Actually, we don't know which one is still there. All we know is that pages $82-$87 appear to be the same 16K of physical memory.
ztrumpet
« Reply #7 on: 27 February, 2010, 17:28:07 »

Quote from Galandros: "In the new calcs, only page 53h is still there. It uses the same port as before. Dunno what third-party software uses it..."
If I'm not mistaken, Omnicalc's Quick Apps uses it, and it works fine on a new 84+SE.

Quigibo
« Reply #8 on: 28 February, 2010, 04:23:58 »

Safe Copy requires the undocumented instruction:
  in f,(c)

Will that be compatible with the Nspire?  If anyone has one, could you please try adding this to any Axe Parser code and see if it crashes:

Asm(0E10ED70)
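
For readers who don't have the opcode table memorized, the four hex bytes in that Asm( line are just `ld c,$10` followed by the undocumented `in f,(c)`. A minimal Python sketch of the decoding follows; the `decode` helper is hypothetical and only knows the two opcodes that appear in this snippet, so it is not a disassembler.

```python
def decode(code):
    """Decode only the two Z80 opcodes used in Asm(0E10ED70)."""
    out = []
    i = 0
    while i < len(code):
        if code[i] == 0x0E:                      # 0E nn = ld c,nn
            out.append(f"ld c,${code[i + 1]:02X}")
            i += 2
        elif code[i:i + 2] == b"\xED\x70":       # ED 70 = in f,(c), undocumented
            out.append("in f,(c)")
            i += 2
        else:
            raise ValueError(f"opcode not covered by this sketch: {code[i]:02X}")
    return out


print(decode(bytes.fromhex("0E10ED70")))
```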
calc84maniac
« Reply #9 on: 28 February, 2010, 05:21:03 »

Quote from Quigibo: "Safe Copy requires the undocumented instruction in f,(c). Will that be compatible with the Nspire? If anyone has one, could you please try adding this to any Axe Parser code and see if it crashes: Asm(0E10ED70)"
It won't be compatible. But it doesn't require that instruction anyway. I always do this:

in a,($10)
rla
jr c,$-3 ;or was it "nc"?
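
The three instructions above form a polling loop: `in a,($10)` is two bytes and `rla` is one, so `jr c,$-3` jumps back to the `in`. Below is a rough Python model of that loop. It assumes bit 7 of the value read from port $10 is the LCD busy flag, which is how the port is commonly documented but is not confirmed in this thread. `wait_lcd_ready` and `read_port` are hypothetical names used only for illustration.

```python
def wait_lcd_ready(read_port):
    """Poll port $10 until bit 7 (assumed busy flag) is clear.

    read_port is any callable taking a port number and returning a byte.
    Returns how many reads were needed.
    """
    polls = 0
    while True:
        a = read_port(0x10)        # in a,($10)
        polls += 1
        carry = (a >> 7) & 1       # rla shifts bit 7 into the carry flag
        if not carry:              # jr c,$-3 loops only while carry is set
            return polls
```

Under that assumption the condition is `c`, since the loop should repeat while the busy bit is set.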
DJ Omnimaga
« Reply #10 on: 28 February, 2010, 06:58:30 »

Hopefully, though, someone will write an 84+ emu for the Nspire to replace the current one, now that Ndless is out :P
Quigibo
« Reply #11 on: 14 March, 2010, 09:23:09 »

Does anyone know an efficient way to do signed multiplication in a 2's complement system?  I can't find any tutorials on the internet.  My naive method is to remove and keep track of the sign bit for each term, multiply the positive versions together, and then add the new sign bit.  Is there a better method?

This is what I'm using to get the sign bit out of de and keep track of it with the 'b' register.  b starts at zero, so I can use bit 0 of b as the new sign bit if I repeat this for hl.


bit 7,d
jr z,__MulSNotNeg
inc b
xor a
sub e
ld e,a
sbc a,a
sub d
ld d,a
__MulSNotNeg:

I would have to do the same thing for hl, and then again at the end when I need to put the bit back, so it seems like a lot of extra code...
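
To make the naive method concrete, here is a Python model of the snippet above plus the surrounding sign bookkeeping. `abs16_track_sign` mirrors the register steps one by one. `mul_signed_naive` is a hypothetical wrapper showing how the pieces would fit together, with Python's `*` standing in for an unsigned multiply routine.

```python
def abs16_track_sign(de, b):
    """Model of the snippet: negate de if negative and count it in b."""
    d, e = de >> 8, de & 0xFF
    if d & 0x80:                     # bit 7,d / jr z,__MulSNotNeg
        b += 1                       # inc b
        a = (0 - e) & 0xFF           # xor a / sub e
        carry = 1 if e != 0 else 0   # borrow out of the low byte
        e = a                        # ld e,a
        a = (0 - carry) & 0xFF       # sbc a,a gives $00 or $FF
        a = (a - d) & 0xFF           # sub d
        d = a                        # ld d,a
    return (d << 8) | e, b


def mul_signed_naive(hl, de):
    """Strip signs, multiply magnitudes, restore the sign from bit 0 of b."""
    de, b = abs16_track_sign(de, 0)
    hl, b = abs16_track_sign(hl, b)
    product = (hl * de) & 0xFFFF     # stand-in for the unsigned routine
    if b & 1:                        # exactly one operand was negative
        product = (-product) & 0xFFFF
    return product
```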
« Last Edit: 14 March, 2010, 09:26:25 by Quigibo »
calc84maniac
« Reply #12 on: 14 March, 2010, 15:38:51 »

No need at all. 16-bit * 16-bit -> 16-bit will give the same result for unsigned and signed arithmetic. Problem solved! ;)

Seriously, try multiplying some signed values in your parser and you will get the right results.

Edit:
Scratch that, I just tried it myself. What is your normal multiplication routine?

Edit2:
I just disassembled it, and I think you are only doing an 8-bit * 16-bit multiplication. That could explain the bad outputs.
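
The opening claim of this post holds for a full 16-bit by 16-bit multiply: adding any multiple of 65536 to either operand changes the product only by a multiple of 65536, so the low 16 bits are the same whether the inputs are read as signed or unsigned. A quick Python check, with helper names made up for the illustration, also shows how dropping the high byte of a negative operand breaks that:

```python
def to_word(n):
    """Two's-complement 16-bit encoding of a Python integer."""
    return n & 0xFFFF


def mul16_low(hl, de):
    """Low 16 bits of a full 16x16 unsigned multiply."""
    return (hl * de) & 0xFFFF


def mul8x16_low(hl, de):
    """Same, but only the low byte of hl takes part (8-bit * 16-bit)."""
    return ((hl & 0xFF) * de) & 0xFFFF


# Signed inputs, unsigned multiply, correct signed result in the low word.
for a, b in [(3, 7), (-3, 7), (-3, -7), (-256, 255), (-32768, 1)]:
    assert mul16_low(to_word(a), to_word(b)) == to_word(a * b)

# With the high byte ignored, -3 is treated as 253.
print(mul8x16_low(to_word(-3), to_word(-7)) == to_word(253 * -7))
```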
« Last Edit: 14 March, 2010, 15:49:09 by calc84maniac »
Galandros
« Reply #13 on: 14 March, 2010, 16:17:13 »

I have some ideas, but I don't know how well they can be implemented:
There are some shift instructions that preserve the sign bit, for example sra. sla is the other arithmetic shift... This probably isn't useful or as fast as other methods.
When the multiplication is finished, you can set the correct sign based on the inputs.
You can probably use the sign flag to optimize.

neg is equivalent to cpl / inc a. Dunno how this carries over to 16-bit register pairs.
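
On the last point, one way to carry the idea over to a register pair is to complement both bytes and then increment the pair as a whole. The Python sketch below checks the result values only. Flags are not modeled, and `neg` and `cpl / inc a` do leave the carry flag in different states. The function names are made up for the check.

```python
def neg8(a):
    """Result of neg on an 8-bit value."""
    return (-a) & 0xFF


def cpl_inc8(a):
    """Result of cpl followed by inc a."""
    return ((a ^ 0xFF) + 1) & 0xFF


def neg16_cpl_inc(hl):
    """Complement h and l separately, then do a 16-bit increment."""
    h = (hl >> 8) ^ 0xFF
    l = (hl & 0xFF) ^ 0xFF
    return (((h << 8) | l) + 1) & 0xFFFF


# Exhaustive check of the 8-bit identity.
assert all(neg8(a) == cpl_inc8(a) for a in range(256))
```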
calc84maniac
« Reply #14 on: 14 March, 2010, 18:55:18 »

Here is your original multiplication routine:

 xor a
 or d
 jr nz,$+3
 ex de,hl
 ld a,l
 ld hl,0
_multloop:
 rra
 jr nc,$+3
 add hl,de
 sla e
 rl d
 or a
 jr nz,_multloop
 ret
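
For reference, here is a register-level Python model of the routine above, written as a reading aid rather than a verified emulation. It tracks only the carry flag, and the name `mul_8x16` is made up. It shows why the routine is effectively 8-bit * 16-bit: when d is non-zero, only l is used as the multiplier and h is ignored.

```python
def mul_8x16(hl, de):
    """Model of the original routine; hl and de are 16-bit words."""
    if (de >> 8) == 0:            # xor a / or d / jr nz,$+3 / ex de,hl
        hl, de = de, hl           # swap so the small operand is in hl
    a = hl & 0xFF                 # ld a,l (h is dropped here)
    hl = 0                        # ld hl,0
    carry = 0                     # "or d" left carry clear
    while True:
        carry, a = a & 1, (a >> 1) | (carry << 7)   # rra
        if carry:                 # jr nc,$+3 skips the add
            hl = (hl + de) & 0xFFFF                 # add hl,de
        de = (de << 1) & 0xFFFF   # sla e / rl d
        carry = 0                 # or a clears carry
        if a == 0:                # jr nz,_multloop
            return hl
```

When one operand fits in 8 bits unsigned the result is the correct low word, for example `mul_8x16(200, 300)` gives 60000. With two negative inputs such as $FFFF and $FFFE, the multiplier is taken as 255 instead of -1.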

Here is my modified signed version, only 8 extra bytes (all in the overhead). It multiplies two signed values, one of which is between -256 and 255.

ld a,d
 rrca
 cp d
 jr nz,$+3
 ex de,hl
 xor a
 inc h
 jr nz,$+3
 sub e
 ld h,a
 ld a,l
 ld l,0
 or a ;Returns if multiplying by 0 or -256, also resets carry flag
 ret z
_multloop:
 rra
 jr nc,$+3
 add hl,de
 sla e
 rl d
 or a
 jr nz,_multloop
 ret
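
A Python model of the signed version can be used to check the stated range. This is a sketch of how the overhead appears to work, not a cycle-accurate emulation: for a multiplier of l - 256, the product is de*l - de*256, and the low word of -de*256 is (-e) shifted into the high byte, which is what the `sub e` / `ld h,a` pair preloads. The loop is simplified to a plain shift-and-add, and `mul_signed_small` is a made-up name.

```python
def mul_signed_small(hl, de):
    """Model of the signed routine; one operand must be in -256..255."""
    h, l = hl >> 8, hl & 0xFF
    d, e = de >> 8, de & 0xFF
    if d in (0x00, 0xFF):         # ld a,d / rrca / cp d / jr nz,$+3 / ex de,hl
        h, l, d, e = d, e, h, l   # de was the small one, so swap it into hl
    a = 0                         # xor a
    h = (h + 1) & 0xFF            # inc h: zero only if h was $FF (negative)
    if h == 0:                    # jr nz,$+3 skips the sub otherwise
        a = (a - e) & 0xFF        # sub e
    h = a                         # ld h,a: high byte of -de*256, or 0
    a, l = l, 0                   # ld a,l / ld l,0
    acc = h << 8
    de = (d << 8) | e
    while a:                      # ret z, then the shift-and-add loop
        if a & 1:
            acc = (acc + de) & 0xFFFF
        de = (de << 1) & 0xFFFF
        a >>= 1
    return acc
```

For example, `mul_signed_small(0xFFFD, 5)` models -3 * 5 and returns $FFF1, the 16-bit encoding of -15.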