Omnimaga: The Coders Of Tomorrow
25 May, 2013, 17:35:00 *
Topic: Assembly Programmers - Help Axe Optimize! (Read 20527 times)
Quigibo
The Executioner
LV11 Super Veteran (Next: 3000)

Gender: Male
Last Login: 21 May, 2013, 02:03:21
Date Registered: 22 January, 2010, 05:02:37
Location: Los Angeles
Posts: 2022


Topic starter
Total Post Ratings: +1019

« Reply #225 on: 12 July, 2011, 08:02:39 »

Awesome, wow! Yeah, forward djnz is about as rare as cpir, although I think calc84maniac's original 4-level routine used them as well, but for a different purpose.

Also on the same subject, although you'll be the only one who knows what I'm talking about, all 12 DispGraph forms work perfectly now.

___Axe_Parser___
Today the calculator, tomorrow the world!
Runer112
Project Author
LV10 31337 u53r (Next: 2000)

Gender: Male
Last Login: Today at 17:32:55
Date Registered: 02 July, 2009, 06:38:05
Posts: 1680


Total Post Ratings: +493

« Reply #226 on: 12 July, 2011, 14:48:57 »

Just checking, have you actually tested the routines out? I ask because I didn't actually test the routines I gave you; I just modeled them on some routines I knew worked and hoped these would work as well. :P
Quigibo
« Reply #227 on: 12 July, 2011, 21:33:09 »

Yeah, I tested everything. One of them had its buffer ordering switched, which I fixed.
calc84maniac
Epic z80 roflpwner
Coder Of Tomorrow
LV11 Super Veteran (Next: 3000)

Gender: Male
Last Login: Today at 16:59:06
Date Registered: 28 August, 2008, 05:09:05
Location: Right behind you.
Posts: 2735


Total Post Ratings: +373

« Reply #228 on: 15 July, 2011, 05:31:27 »

Here's a peephole optimization suggestion: Keep track of whether the value in HL is a constant or not, and if so, what constant. For example, I have some code:

If condition
do stuff
Else
16->W
End
Obviously, after the Else, HL has to be 0. Thus the 16 can be reduced to a ld l,16 instead of ld hl,16. It might be possible to auto-optimize stuff like 1->A:2->B into 1->A+1->B, but you could always leave that to the user like usual.
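The constant-tracking idea can be sketched in Python. This is a hypothetical helper, not Axe's code generator: `load_constant` and its arguments are made-up names, and it only models the choice between a full `ld hl,nn` (3 bytes) and a single-register load (2 bytes) when the compiler already knows what HL holds.

```python
def load_constant(known_hl, value):
    """Return the Z80 mnemonics needed to leave `value` in HL.

    known_hl is the constant the compiler knows HL holds at this point,
    or None if HL is unknown. (Hypothetical model, not Axe's real code.)
    """
    value &= 0xFFFF
    if known_hl is None:
        return [f"ld hl,{value}"]           # nothing known: full 3-byte load
    if known_hl == value:
        return []                           # HL already holds the constant
    if (known_hl >> 8) == (value >> 8):
        return [f"ld l,{value & 0xFF}"]     # high byte matches: 2-byte load
    if (known_hl & 0xFF) == (value & 0xFF):
        return [f"ld h,{value >> 8}"]       # low byte matches: 2-byte load
    return [f"ld hl,{value}"]


# After the Else of a failed If, HL is known to be 0, so 16->W only needs ld l,16.
print(load_constant(0, 16))
```

The tracked value would have to be reset to "unknown" at every label and after any command whose result is not a compile-time constant.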

Also, I found it a bit annoying that when I did something like If E<(96*256), the part in the parentheses wasn't reduced to a constant before doing the less-than operation. Could the look-ahead parsing be able to detect constants in parentheses?
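As a rough illustration of the request, here is a small constant folder in Python. It assumes the expression has already been parsed into nested tuples of the form `(operator, left, right)`; that tree shape and the `fold` name are assumptions for the sketch and say nothing about how Axe's parser actually represents expressions.

```python
import operator

# Operators the sketch knows how to fold at compile time.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    """Collapse constant subtrees, wrapping results to 16 bits."""
    if not isinstance(node, tuple):
        return node                          # a number or a variable name
    op, left, right = node[0], fold(node[1]), fold(node[2])
    if op in OPS and isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right) & 0xFFFF
    return (op, left, right)                 # not constant: keep the operation


# E<(96*256) becomes a compare against one constant.
print(fold(("<", "E", ("*", 96, 256))))
```

Only the parenthesized part is reduced; the comparison against the variable E is left for run time.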

"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman
Runer112
« Reply #229 on: 15 July, 2011, 05:34:17 »

Quote from: calc84maniac
Also, I found it a bit annoying that when I did something like If E<(96*256), the part in the parentheses wasn't reduced to a constant before doing the less-than operation. Could the look-ahead parsing be able to detect constants in parentheses?

This times a million.
ztrumpet
The Rarely Active One
LV13 Extreme Addict (Next: 9001)

Gender: Male
Last Login: 22 May, 2013, 03:10:30
Date Registered: 08 November, 2009, 21:10:12
Location: Michigan
Posts: 5687


Total Post Ratings: +360

« Reply #230 on: 15 July, 2011, 05:35:45 »

Quote from: calc84maniac
Also, I found it a bit annoying that when I did something like If E<(96*256), the part in the parentheses wasn't reduced to a constant before doing the less-than operation. Could the look-ahead parsing be able to detect constants in parentheses?

Quote from: Runer112
This times a million.

This times a million and five.

Seriously, I thought Axe did this already. Apparently not... so, please? :D
« Last Edit: 15 July, 2011, 05:35:53 by ztrumpet »

Runer112
« Reply #231 on: 25 July, 2011, 17:16:41 »

Quigibo, you read my mind. I was about to post code that lets the commands that deal with archived variables work with variables in RAM too, but you added that in Axe 1.0.2 before I could finish! However, I'll make a post anyway because my p_GetArc routine is smaller. :P I also have a few other things.




p_GetArc: 7 bytes smaller.


Original:
p_GetArc:
.db __GetArcEnd-1-$
push de
MOV9TOOP1()
B_CALL(_ChkFindSym)
jr c,__GetArcFail
dec b
inc b
jr z,__GetArcRam
B_CALL(_IsFixedName)
ld hl,9
jr z,__GetArcName
__GetArcStatic:
ld l,12
and %00011111
jr z,__GetArcDone
cp l
jr z,__GetArcDone
ld l,14
jr __GetArcDone
__GetArcName:
add hl,de
bit 7,h
jr z,$+7
res 7,h
set 6,h
inc b
B_CALL(_LoadDEIndPaged)
ld d,0
inc e
inc e
__GetArcDone:
add hl,de
ex de,hl
__GetArcStore:
pop hl
ld (hl),e
inc hl
ld (hl),d
inc hl
ld (hl),b
ex de,hl
ret
__GetArcRam:
and %00011111
jr z,__GetArcStore
cp CplxObj
jr z,__GetArcStore
inc de
inc de
jr __GetArcStore
__GetArcFail:
ld hl,0
pop de
ret
__GetArcEnd:
       
   

Optimized:
p_GetArc:
.db __GetArcEnd-1-$
push de
MOV9TOOP1()
B_CALL(_ChkFindSym)
jr c,__GetArcFail
dec b
inc b
jr z,__GetArcRam
B_CALL(_IsFixedName)
ld hl,9
jr z,__GetArcName
ld l,12
__GetArcChkFloat:
and %00011111
jr z,__GetArcDone
cp CplxObj
jr z,__GetArcDone
inc l
inc l
jr __GetArcDone
__GetArcName:
add hl,de
bit 7,h
jr z,$+7
res 7,h
set 6,h
inc b
B_CALL(_LoadDEIndPaged)
ld d,0
inc e
inc e
__GetArcDone:
add hl,de
ex de,hl
pop hl
ld (hl),e
inc hl
ld (hl),d
inc hl
ld (hl),b
ex de,hl
ret
__GetArcRam:
ld h,b
ld l,b
jr __GetArcChkFloat
__GetArcFail:
ld hl,0
pop de
ret
__GetArcEnd:
       




p_ReadArc: Bumping an old request for larger but drastically faster archive reading routines. The routines would need to be modified slightly to allow for reading from RAM as well, but that should be no problem. I would understand if you didn't want to add the app version, but the program version is immensely better in my opinion.

And on the topic of stuff that involves port 6, I think it would be nice if the archive byte reading routine avoided using a B_CALL for a massive speed boost, especially for code compiled as programs:

p_ReadArc: 18 bytes (2x) larger, but ~1400 cycles (!!!10x!!!) faster


p_ReadArc:
.db __ReadArcEnd-1-$
ld c,a
in a,(6)
ld b,a
ld a,h
set 6,h
res 7,h
rlca
rlca
dec a
and %00000011
add a,c
out (6),a
ld c,(hl)
inc hl
bit 7,h
jr z,__ReadArcNoBoundary
set 6,h
res 7,h
inc a
out (6),a
__ReadArcNoBoundary:
ld l,(hl)
ld h,c
ld a,b
out (6),a
ret
__ReadArcEnd:
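
The page arithmetic in the program version can be modeled in Python. This is only a sketch of my reading of the listing: flash is represented as a dictionary keyed by (page, address), and the names `to_window` and `read_arc_word` are made up for the illustration. The `rlca / rlca / dec a / and %00000011` sequence turns the top two address bits into a page offset, and `set 6,h / res 7,h` forces the address back into the $4000-$7FFF window.

```python
def to_window(page, addr):
    """Map (page, 16-bit pointer) onto the $4000-$7FFF window."""
    top_bits = (addr >> 14) & 3              # what rlca/rlca leave in bits 1-0
    page = (page + ((top_bits - 1) & 3)) & 0xFF
    addr = (addr & 0x3FFF) | 0x4000          # set 6,h / res 7,h
    return page, addr


def read_arc_word(flash, page, addr):
    """Read two bytes, stepping to the next page at the $7FFF boundary."""
    page, addr = to_window(page, addr)
    first = flash[(page, addr)]              # ld c,(hl)
    addr += 1
    if addr & 0x8000:                        # bit 7,h: ran past $7FFF
        addr = 0x4000
        page = (page + 1) & 0xFF
    second = flash[(page, addr)]             # ld l,(hl)
    return (first << 8) | second             # ld h,c puts the first byte in H


flash = {(8, 0x7FFF): 0x12, (9, 0x4000): 0x34}
print(hex(read_arc_word(flash, 8, 0x7FFF)))
```

The model leaves out saving and restoring port 6, which the routine does through B.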

p_ReadArcApp: 36 bytes (3x) larger, but ~1050 cycles (4x) faster


p_ReadArcApp:
.db __ReadArcAppEnd-1-$
push hl
ld hl,$0000
ld de,ramCode
ld bc,__ReadArcAppRamCodeEnd-__ReadArcAppRamCode
ldir
pop hl
ld e,a
ld c,6
in b,(c)
ld a,h
set 6,h
res 7,h
rlca
rlca
dec a
and %00000011
add a,e
call ramCode
ld e,d
inc hl
bit 7,h
jr z,__ReadArcAppNoBoundary
set 6,h
res 7,h
inc a
__ReadArcAppNoBoundary:
call ramCode
ex de,hl
ret
__ReadArcAppEnd:
.db rp_Ans,__ReadArcAppEnd-p_ReadArcApp-3

__ReadArcAppRamCode:
out (6),a
ld d,(hl)
out (c),b
ret
__ReadArcAppRamCodeEnd:




p_CopyArc: Modified to allow for sources in RAM.


Original:
p_CopyArc:
.db __CopyArcEnd-1-$
pop ix
pop de
ex (sp),hl
ld b,a
ld a,h
rlca
rlca
dec a
and %00000011
add a,b
set 6,h
res 7,h
pop bc
B_CALL(_FlashToRAM)
jp (ix)
__CopyArcEnd:
       
   

Modified:
p_CopyArc:
.db __CopyArcEnd-1-$
ex (sp),hl
pop bc
pop de
ex (sp),hl
or a
jr z,__CopyArcRam
push bc
ld b,a
ld a,h
rlca
rlca
dec a
and %00000011
add a,b
set 6,h
res 7,h
pop bc
B_CALL(_FlashToRAM)
ret
__CopyArcRam:
ldir
ret
__CopyArcEnd:
       




Also, I'm not sure why I only just realized this, but why don't the 8-bit logic operations on variables just load the variable into register a instead of de to save 2 bytes?



« Last Edit: 25 July, 2011, 20:20:48 by Runer112 »
Quigibo
« Reply #232 on: 26 July, 2011, 12:04:13 »

Hmm, I'm still not sure if the extra speed is worth the size increase. I guess a new argument for the speed is that it makes file reads more consistent (a program reading a file from archive might otherwise run slower than one reading from RAM). But I will put this up in the poll, since I'd like to know how many people this would benefit or hurt.

I don't do that optimization for the 8-bit logical operators because then I'd need duplicate commands and even more special casing. This is something that can easily be peephole optimized in the future, however, so it might become a non-issue.

I was trying to recursively parse constants in parentheses in the last update, but it was extremely complicated so I gave up. I will have to modify the core number reading system to get it to work (which I was planning to do eventually anyway), so I will get to it then.
Quigibo
« Reply #233 on: 27 July, 2011, 20:47:08 »

Wait... Runer! What were you thinking! The App code for file reading can be the same as the program code, but just use port 7 instead of port 6 and set the high bits of hl for the $8000-$BFFF range. That's what the Axe app does. :)

EDIT: Also, another thing the routines would need is to disable interrupts and then restore them afterwards... which I can use the "Safety" code for, but it's going to be slower and even larger.
« Last Edit: 27 July, 2011, 20:49:59 by Quigibo »
Runer112
« Reply #234 on: 28 July, 2011, 04:47:25 »

The app version might need to disable interrupts, but why would the program version need to? Both Axe's and the OS's interrupt handlers back up the page in the $4000-$7FFF bank and restore it upon returning.
Xeda112358
Xombie. I am it.
Coder Of Tomorrow
LV12 Extreme Poster (Next: 5000)

Last Login: 23 May, 2013, 22:01:23
Date Registered: 31 October, 2010, 08:46:36
Location: Land of Little Cubes and Tea, NY
Posts: 3760


Total Post Ratings: +610

« Reply #235 on: 11 August, 2011, 19:10:51 »

I was toying around with some math routines while I was away and I was curious about the square root algorithms. Are they designed to return the square root rounded down, rounded up, or rounded to the nearest integer? If it is rounded down and you want the nearest integer instead, here is some code I wrote a while ago (it isn't even close to what Axe needs, so take it only as an example):

;===============================================================
sqrtE:
;===============================================================
;Input:
;     E is the value to find the square root of
;Outputs:
;     A is E-D^2
;     B is 0
;     D is the rounded result
;     E is not changed
;     HL is not changed
;Destroys:
;     C
;
        xor a               ;1      4         4
        ld d,a              ;1      4         4
        ld c,a              ;1      4         4
        ld b,4              ;2      7         7
sqrtELoop:
        rlc d               ;2      8        32
        ld c,d              ;1      4        16
        scf                 ;1      4        16
        rl c                ;2      8        32

        rlc e               ;2      8        32
        rla                 ;1      4        16
        rlc e               ;2      8        32
        rla                 ;1      4        16

        cp c                ;1      4        16
        jr c,$+4            ;4    12|15      48+3x
          inc d             ;--    --        --
          sub c             ;--    --        --
        djnz sqrtELoop      ;2    13|8       47
        cp d                ;1      4         4
        jr c,$+3            ;3    12|11     12|11
          inc d             ;--    --        --
        ret                 ;1     10        10
;===============================================================
;Size  : 29 bytes
;Speed : 347+3x cycles plus 1 if rounded down
;   x is the number of set bits in the result.
;===============================================================
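
For readers who would rather follow the algorithm than the registers, here is a Python model of the same digit-by-digit square root with a rounding step at the end. The function names are made up for the sketch. One caveat from tracing the listing by hand: the final `cp d` / `jr c,$+3` pair appears to increment D whenever the remainder is greater than or equal to D, which would also round up E = 0 and E = 2. Round-to-nearest needs a strict "remainder greater than root" test, so that is what the sketch uses; treat this as my reading of the code rather than a tested result on hardware.

```python
def sqrt_floor_8bit(e):
    """Return (root, remainder) with root = floor(sqrt(e)) for 0 <= e <= 255."""
    root, rem = 0, 0
    for _ in range(4):                       # 4 result bits, 2 input bits each
        rem = (rem << 2) | (e >> 6)          # rlc e / rla, twice
        e = (e << 2) & 0xFF
        trial = (root << 2) | 1              # rlc d / ld c,d / scf / rl c
        root <<= 1
        if rem >= trial:                     # cp c / jr c skips the next two
            rem -= trial                     # sub c
            root += 1                        # inc d
    return root, rem


def sqrt_round_8bit(e):
    """Round to nearest: go up only when e > root^2 + root."""
    root, rem = sqrt_floor_8bit(e)
    return root + 1 if rem > root else root


print([sqrt_round_8bit(e) for e in (0, 2, 3, 6, 7, 255)])
```

The rounding test works because (root + 0.5)^2 = root^2 + root + 0.25, so the nearest integer is root + 1 exactly when the remainder e - root^2 exceeds root.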

The only reason I mention this is that I know a lot of graphical algorithms would give better results if the square root were rounded to the nearest integer, as opposed to always rounded up or down.

Sorry if this was already covered and I missed it :-/
Spoiler for Hidden:
EDIT:
Wow, I just did something I didn't think was even possible. I found a good use for a forward djnz. I must have had too much radiation for breakfast...
>.> Hehe, I use forward djnz in many, if not most, of my programs... It is one of the most useful tricks I use and is kind of my signature touch :) I use it to save time and memory a lot, especially in instances like this:

     ld b,a
     or a \ jr nz,Next1
       ;code
Next1:
     djnz Next2
       ;code
Next2:
     djnz Next3
       ;code
;...et cetera
* Xeda112358 loves it
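
A small Python model of the dispatch chain above, for anyone who wants to check which case runs for which value of A. The `dispatch` name and the case labels are made up for the sketch, and it assumes each `;code` section leaves B untouched. Each djnz decrements B and falls through into the next section only when B reaches zero, so it acts as a 2-byte "is this case N?" test.

```python
def dispatch(a):
    """Return the names of the code sections that run for input A."""
    ran = []
    b = a                                    # ld b,a
    if a == 0:                               # or a \ jr nz,Next1
        ran.append("case0")
    b = (b - 1) & 0xFF                       # Next1: djnz Next2
    if b == 0:                               # fall through only when B hit 0
        ran.append("case1")
    b = (b - 1) & 0xFF                       # Next2: djnz Next3
    if b == 0:
        ran.append("case2")
    return ran


print([dispatch(a) for a in range(4)])
```

When A is 0 the first djnz wraps B around to 255, so none of the later sections run by accident.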
« Last Edit: 11 August, 2011, 20:29:55 by Xeda112358 »



Grammer Download (2.29.04.12)
Latest update (possibly incomplete)
My pastebin
Spoiler for FileSyst:
FileSyst is an application that provides a folder and filesystem for the TI-83+/84+ calculators. It is designed to be easy to access and use in BASIC, and it can be used to access game files and save data, or to create a command prompt, among other things:

Spoiler for Graphiti:
This is a graph explorer for graph theory. It will require lots of work to finish. Currently you can:
Add/delete vertices
Add edges (direction not shown, but they are directed)
Arrange vertices in a circle (in the future, you will be able to define levels of rings and the number of nodes in each)
Create complete graphs quickly

Plans:
Add adjacency matrix viewer
Deleting edges
Multiple graphs support
Arrows for directed graphs
Planarity testing
Matrix operations
Weighted edges
Chromatic polynomials
Chromatic numbers

Spoiler for Stats:

Samocal             [o---------]
Virtual Processor   [o---------]
EnG                 [oo--------]
Grammer             [ooo-------]
AsmComp             [ooo-------]
Partex              [oooo------]
BatLib              [oooooooo--]
Grammer82           [----------]
Grammer68000        [----------]


Pseudonyms:  Zeda, Xeda, Thunderbolt
Languages:   English, français
Programming: z80 Assembly
             Grammer
             TI-BASIC (83/84/+/SE, 89/89t/92)
Known For:   -Creator of the Grammer programming language
              (Winning program of zContest2011)
             -BatLib- One of the most feature packed libraries for BASIC programmers available
              with over 100 functions and a simple programming language
             -Learning to program z80 in hexadecimal before using an assembler (no computer was
              available!)
╔═╦╗░╠═╬╣▒║ ║║▓╚═╩╝█


Runer112
« Reply #236 on: 11 August, 2011, 22:29:36 »

All of Axe's math simply truncates, so I think the current square root algorithm is pretty good. Anyway, you have to remember that Axe uses 16-bit math and that's an 8-bit square root function. :P
Xeda112358
« Reply #237 on: 17 August, 2011, 01:39:41 »

Yeah, I know, but I just wanted to give a simple, easy-to-follow example. It is really only the last few bytes that are important. Also, great job with the optimisations :D I wish I could help more, but most of the code is a bit beyond my optimisation abilities.

Runer112
« Reply #238 on: 30 August, 2011, 08:45:57 »

I think I just performed the most ridiculous, impressive optimization I've ever performed on an Axe command. 27 bytes optimized down to about 63% of its size: 17 bytes! w00t


Original:
p_DKeyVar:
.db __DKeyVarEnd-1-$
dec l
ld a,l
rra
rra
rra
and %00000111
inc a
ld b,a
ld a,%01111111
rlca
djnz $-1
ld h,a
ld a,l
and %00000111
inc a
ld b,a
ld a,%10000000
rlca
djnz $-1
ld l,a
ret
__DKeyVarEnd:
       
   

Optimized:
p_DKeyVar:
.db __DKeyVarEnd-1-$
ld a,l
ld hl,%0111111111110111
rlc h
adc a,l
jr c,$-3
ld l,%00000001
rrc l
inc a
jr nz,$-3
ret
__DKeyVarEnd:
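
A Python model of both versions may help show why they agree. It is a sketch based on my reading of the two listings, with made-up function names, and it assumes key codes 1 to 56. Both versions turn a key code into a key group mask in H (one bit cleared) and a key mask in L (one bit set). The short version folds the "subtract 1, divide by 8" step into the first loop: the first `adc a,l` adds -9 with carry clear, and every later pass adds -9 plus the carry rotated out of H, which is -8.

```python
def rotl8(value, count):
    """Rotate an 8-bit value left by count bits."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF


def dkey_masks_direct(key):
    """Model of the 27-byte version: explicit shifts and two djnz loops."""
    index = key - 1                          # dec l
    group, bit = index >> 3, index & 7
    return rotl8(0x7F, group + 1), rotl8(0x80, bit + 1)


def dkey_masks_loop(key):
    """Model of the 17-byte version: two counted rotate loops."""
    a, h, l = key, 0x7F, 0xF7                # ld a,l / ld hl,%0111111111110111
    while True:
        carry = h >> 7                       # rlc h
        h = ((h << 1) | carry) & 0xFF
        total = a + l + carry                # adc a,l
        a = total & 0xFF
        if total < 0x100:                    # jr c loops while carry is set
            break
    l = 0x01                                 # ld l,%00000001
    while True:
        l = ((l >> 1) | (l << 7)) & 0xFF     # rrc l
        a = (a + 1) & 0xFF                   # inc a
        if a == 0:                           # jr nz loops until A reaches 0
            break
    return h, l


print(all(dkey_masks_direct(k) == dkey_masks_loop(k) for k in range(1, 57)))
```

After the first loop A holds a value from -8 to -1, so the second loop rotates L right between 8 and 1 times, which lands the set bit in the same position as the original left rotation.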
       
« Last Edit: 30 August, 2011, 08:53:45 by Runer112 »
Quigibo
« Reply #239 on: 30 August, 2011, 09:02:02 »

O_O  I don't even understand what's going on here.  That's quite impressive!

EDIT: Also, a really obvious optimization I just noticed is that the return should be replaced by a jump to the direct key command so it doesn't have to return and re-call it. :P
« Last Edit: 30 August, 2011, 09:06:23 by Quigibo »
