### Topic: Assembly Programmers - Help Axe Optimize!


#### Runer112

• Project Author
• LV11 Super Veteran (Next: 3000)
• Posts: 2289
• Rating: +639/-31
##### Re: Assembly Programmers - Help Axe Optimize!
« Reply #75 on: September 22, 2010, 06:42:19 pm »
And how would the parser decide to do that?

#### Quigibo

• The Executioner
• CoT Emeritus
• LV11 Super Veteran (Next: 3000)
• Posts: 2031
• Rating: +1075/-24
##### Re: Assembly Programmers - Help Axe Optimize!
« Reply #76 on: September 22, 2010, 06:48:29 pm »
The problem with conditional "short-circuit evaluation" is that it has to do a lot of non-linear "look-ahead" parsing to determine if it's okay to get out of the statement early or not.  You might for example have If A≠5 and sub(EQL,B,C) which might need to evaluate the second expression even if the first one is false.  The idea definitely sounds good though, but it seems like it would be really complicated for the compiler to tell whether or not it can actually use that optimization and be completely compatible with previous versions.  And even when it can, I would have to write completely new block code and assembly templates for those conditionals.
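To make the compatibility concern concrete, here is a minimal Python sketch (not Axe, and not how the parser works internally). The helper `side_effect` is hypothetical and stands in for something like `sub(EQL,B,C)`: a second operand that does work when evaluated. With short-circuiting the second operand is skipped when the first is false; with eager evaluation, which is what existing programs may be relying on, it always runs.

```python
calls = []

def side_effect(name, value):
    """Hypothetical operand: records that it was evaluated, then returns value."""
    calls.append(name)
    return value

# Short-circuit evaluation: the right operand is skipped when the left is False.
calls.clear()
short = side_effect("left", False) and side_effect("right", True)
short_calls = list(calls)

# Eager evaluation: both operands run before the results are combined.
calls.clear()
eager = side_effect("left", False) & side_effect("right", True)
eager_calls = list(calls)

print(short_calls)  # ['left']
print(eager_calls)  # ['left', 'right']
```

Both forms produce the same truth value here; only the set of evaluated operands differs, which is exactly what a program with side effects in the second operand would notice.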
___Axe_Parser___
Today the calculator, tomorrow the world!

#### Runer112

• Project Author
• LV11 Super Veteran (Next: 3000)
• Posts: 2289
• Rating: +639/-31
##### Re: Assembly Programmers - Help Axe Optimize!
« Reply #77 on: October 20, 2010, 08:56:30 am »
Oops, necropost, oh well

I don't know if this approach was purposely left out, as it's 15 bytes larger than the current routine and sometimes slower. I'm referring to the square root routine. Whereas the current routine (14 bytes) takes 37n+38 T-states (linear time), where n is the result+1 (1-256), the following routine (29 bytes) takes 5n+800 T-states (near constant time), where n is the number of set bits in the result (0-8). The existing routine is faster for values that would yield results of 0-19, but this routine would be faster for values that would yield results of 20-255, which is a much broader range of the 8-bit spectrum. Also, it would be much more reliable to run at a near constant speed in programs which rely on that to run smoothly themselves. The existing routine would take only a few hundred T-states for low inputs, but would take up to OVER NINE THOUSAND T-states to calculate the square roots for the highest inputs. So it's up to you if this is something you want to use.

Code:
p_Sqrt:
  .db __SqrtEnd-1-$
  ld a,l
  ld l,h
  ld de,$0040
  ld h,d
  ld b,8
  or a
__SqrtLoop:
  sbc hl,de
  jr nc,__SqrtSkip
  add hl,de
__SqrtSkip:
  ccf
  rl d
  rla
  adc hl,hl
  rla
  adc hl,hl
  djnz __SqrtLoop
  ld h,0
  ld l,d
  ret
__SqrtEnd:
« Last Edit: October 20, 2010, 09:10:44 am by Runer112 »
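The crossover point quoted above can be checked with a few lines of Python. This is only arithmetic on the two cost formulas as stated in the post (37n+38 with n = result+1, and 5n+800 with n = number of set bits in the result); it does not measure the routines themselves, and the function names are made up for illustration.

```python
def linear_tstates(result):
    # Existing routine, per the post: 37n + 38 with n = result + 1.
    return 37 * (result + 1) + 38

def bitwise_tstates(result):
    # Proposed routine, per the post: 5n + 800 with n = set bits in the result.
    return 5 * bin(result).count("1") + 800

# Every 8-bit result for which the proposed routine costs fewer T-states.
faster = [r for r in range(256) if bitwise_tstates(r) < linear_tstates(r)]

print(faster[0], faster[-1])   # 20 255
print(linear_tstates(255))     # 9510
```

Under those formulas the proposed routine wins for every result from 20 to 255, and the existing routine's worst case is 9510 T-states.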

#### Runer112

• Project Author
• LV11 Super Veteran (Next: 3000)
• Posts: 2289
• Rating: +639/-31
##### Re: Assembly Programmers - Help Axe Optimize!
« Reply #78 on: October 30, 2010, 08:11:33 pm »
I think it's been long enough that I can safely double post

Bit routine optimizations! Please tell me if any of these wouldn't work correctly, as I wrote them myself and I'm not a terribly experienced assembly programmer so that's a definite possibility.

Code: (Current code)
p_GetBit0:
  .db 5 ;5 bytes, 36 T-states
  add hl,hl
  ccf
  sbc hl,hl
  inc hl
p_GetBit1:
  .db 6 ;6 bytes, 47 T-states
  add hl,hl
  add hl,hl
  ccf
  sbc hl,hl
  inc hl
p_GetBit2:
  .db 7 ;7 bytes, 58 T-states
  add hl,hl
  add hl,hl
  add hl,hl
  ccf
  sbc hl,hl
  inc hl
p_GetBit6:
  .db 7 ;7 bytes, 37 T-states
  ld a,h
  rra
  rra
  ccf
  sbc hl,hl
  inc hl
p_GetBit7:
  .db 6 ;6 bytes, 33 T-states
  rr h
  ccf
  sbc hl,hl
  inc hl
p_GetBit8:
  .db 6 ;6 bytes, 33 T-states
  rl l
  ccf
  sbc hl,hl
  inc hl
p_GetBit9:
  .db 7 ;7 bytes, 37 T-states
  ld a,l
  rla
  rla
  ccf
  sbc hl,hl
  inc hl
p_GetBit10:
  .db 8 ;8 bytes, 30/29 T-states
  bit 5,l
  ld hl,0
  jr z,$+3
  inc l
p_GetBit14:
  .db 7 ;7 bytes, 37 T-states
  ld a,l
  rra
  rra
  ccf
  sbc hl,hl
  inc hl
p_GetBit15:
  .db 6 ;6 bytes, 33 T-states
  rr l
  ccf
  sbc hl,hl
  inc hl

Code: (Optimized code)
p_GetBit0:
  .db 5 ;5 bytes, 27 T-states
  xor a
  add hl,hl
  ld h,a
  rla
  ld l,a
p_GetBit1:
  .db 6 ;6 bytes, 38 T-states
  xor a
  add hl,hl
  add hl,hl
  ld h,a
  rla
  ld l,a
p_GetBit2:
  .db 7 ;7 bytes, 49 T-states
  xor a
  add hl,hl
  add hl,hl
  add hl,hl
  ld h,a
  rla
  ld l,a
p_GetBit6:
  .db 7 ;7 bytes, 26 T-states
  ld a,%00000010
  and h
  rrca
  ld h,0
  ld l,a
p_GetBit7:
  .db 6 ;6 bytes, 22 T-states
  ld a,%00000001
  and h
  ld h,0
  ld l,a
p_GetBit8:
  .db 5 ;5 bytes, 27 T-states
  xor a
  ld h,a
  add hl,hl
  ld l,h
  ld h,a
p_GetBit9:
  .db 6 ;6 bytes, 38 T-states
  xor a
  add hl,hl
  ld h,a
  add hl,hl
  ld l,h
  ld h,a
p_GetBit10:
  .db 7 ;7 bytes, 49 T-states
  xor a
  add hl,hl
  add hl,hl
  ld h,a
  add hl,hl
  ld l,h
  ld h,a
p_GetBit14:
  .db 7 ;7 bytes, 26 T-states
  ld a,%00000010
  and l
  rrca
  ld h,0
  ld l,a
p_GetBit15:
  .db 5 ;5 bytes, 20 T-states
  xor a
  ld h,a
  inc a
  and l
  ld l,a

Other optimizations:
• The signed less than zero comparison (p_SLT0) can be optimized to the optimized p_GetBit0 above.
« Last Edit: October 30, 2010, 09:12:42 pm by Runer112 »

#### Runer112

• Project Author
• LV11 Super Veteran (Next: 3000)
• Posts: 2289
• Rating: +639/-31
##### Re: Assembly Programmers - Help Axe Optimize!
« Reply #79 on: November 10, 2010, 02:38:37 am »
Signed greater than comparison:

Code: (Current code)
p_SIntGt:
  .db 13 ;13 bytes, 48 T-states
  ex de,hl
  xor a
  ld b,h
  sbc hl,de
  ld h,a
  rra
  xor b
  xor d
  rlca
  and 1
  ld l,a

Code: (Optimized code)
p_SIntGt:
  .db 12 ;12 bytes, 67 T-states
  ld bc,$8000
  add hl,bc
  ex de,hl
  add hl,bc
  xor a
  sbc hl,de
  ld h,a
  rla
  ld l,a

You getting all this Quigibo?
« Last Edit: November 10, 2010, 02:38:59 am by Runer112 »

#### DJ Omnimaga

• Clacualters are teh gr33t
• CoT Emeritus
• LV15 Omnimagician (Next: --)
• Posts: 55942
• Rating: +3154/-232
• CodeWalrus founder & retired Omnimaga founder
##### Re: Assembly Programmers - Help Axe Optimize!
« Reply #80 on: November 10, 2010, 02:40:58 am »
I think he is too busy, which might explain why he doesn't respond.
* DJ Omnimaga hopes his school schedule doesn't get so drastic that he gets forced to quit the community for good... I am not too worried about the future of Axe programming, though. I was worried that if he became less active, there would be less activity in his sub-forum since he replied to a lot of help topics, but activity still continued. I guess a huge thanks to you and a bunch of other people is in order. Sadly, having quit programming a while ago, I did not really participate much, though X.x
« Last Edit: November 10, 2010, 02:42:45 am by DJ Omnimaga »
Now active at https://discord.gg/cuZcfcF (CodeWalrus server)

#### Quigibo

• The Executioner
• CoT Emeritus
• LV11 Super Veteran (Next: 3000)
• Posts: 2031
• Rating: +1075/-24
##### Re: Assembly Programmers - Help Axe Optimize!
« Reply #81 on: November 10, 2010, 07:55:16 pm »
Yeah, I'm still reading all of this. Even though I'm less active, I still visit just about every day. I've even been able to make a little more progress with Axe despite my busy schedule.

Runer112, are you sure that comparison is correct?  It seems like all it does is just change the high order bit before doing the subtraction.  It needs to check if the parity changed in that bit before and after the subtraction.  I actually already have plans to optimize this since I will be able to use the parity/overflow flag once I get relative jump replacement working with the axioms (so I can carry that feature over to the built-in commands).
___Axe_Parser___
Today the calculator, tomorrow the world!

#### calc84maniac

• eZ80 Guru
• Coder Of Tomorrow
• LV11 Super Veteran (Next: 3000)
• Posts: 2912
• Rating: +471/-17
##### Re: Assembly Programmers - Help Axe Optimize!
« Reply #82 on: November 10, 2010, 09:20:06 pm »
Changing the high order bit does work, actually. It changes a comparison in the -32768 to 32767 range to a comparison in the 0 to 65535 range (effectively changing from a signed comparison to an unsigned comparison).
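The claim can be checked numerically with a small Python sketch. It models only the arithmetic of the optimized p_SIntGt (add $8000 to both operands, then do an unsigned subtract and take the carry), not the Z80 instructions themselves. The function names and the choice of test values are assumptions made for illustration, and "hl > de" refers to the register contents on entry to the routine.

```python
def to_signed(x):
    """Interpret a 16-bit value as two's complement."""
    return x - 0x10000 if x & 0x8000 else x

def signed_gt_via_bias(hl, de):
    # Adding $8000 (mod 65536) flips the high bit of each operand.
    biased_hl = (hl + 0x8000) & 0xFFFF
    biased_de = (de + 0x8000) & 0xFFFF
    # The routine subtracts biased hl from biased de; carry is set when
    # biased_de < biased_hl as unsigned numbers, and that carry is the result.
    return int(biased_de < biased_hl)

# Edge cases around the sign boundary plus the values used later in the thread.
samples = [0x0000, 0x0001, 0x2000, 0x6000, 0x7FFF,
           0x8000, 0x8001, 0xA000, 0xE000, 0xFFFF]

agrees = all(
    signed_gt_via_bias(hl, de) == int(to_signed(hl) > to_signed(de))
    for hl in samples for de in samples
)
print(agrees)  # True
```

The bias maps -32768..32767 onto 0..65535 while preserving order, which is why a plain unsigned compare gives the signed answer.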
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

#### Runer112

• Project Author
• LV11 Super Veteran (Next: 3000)
• Posts: 2289
• Rating: +639/-31
##### Re: Assembly Programmers - Help Axe Optimize!
« Reply #83 on: November 10, 2010, 11:55:08 pm »
> Changing the high order bit does work, actually. It changes a comparison in the -32768 to 32767 range to a comparison in the 0 to 65535 range (effectively changing from a signed comparison to an unsigned comparison).

Yup. This is the only signed comparison for which this method is better, though.

Do all the bit optimizations look correct by the way?
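One way to sanity-check the optimized bit routines is to model them in Python and compare against a reference over all 65536 inputs. This is a hand-written model of the register arithmetic, not an emulator run, so it only catches logic errors that survive the translation. It assumes, as the routines themselves imply, that bit 0 is the most significant bit of hl and bit 15 the least significant; all function names are hypothetical.

```python
def reference_bit(value, n):
    # Assumed numbering: bit 0 is the MSB of the 16-bit value.
    return (value >> (15 - n)) & 1

def add_hl_hl(hl):
    """Model ADD HL,HL: return (new_hl, carry_out)."""
    return (hl << 1) & 0xFFFF, hl >> 15

def getbit_high(hl, n):
    # p_GetBit0..2: xor a / add hl,hl (n+1 times) / ld h,a / rla / ld l,a
    a, carry = 0, 0
    for _ in range(n + 1):
        hl, carry = add_hl_hl(hl)
    a = ((a << 1) | carry) & 0xFF  # rla shifts the carry into bit 0 of a
    return a                        # h = 0, l = a

def getbit_mask(hl, n):
    # p_GetBit6/7/14/15: mask one bit of h or l, rotate right if it was bit 1
    reg = hl >> 8 if n < 8 else hl & 0xFF
    if n in (6, 14):
        a = 0b10 & reg
        a = (a >> 1) | ((a & 1) << 7)  # rrca
    else:
        a = 0b01 & reg
    return a                        # h = 0, l = a

def getbit_low(hl, n):
    # p_GetBit8..10: xor a / add hl,hl (n-8 times) / ld h,a / add hl,hl / ld l,h / ld h,a
    for _ in range(n - 8):
        hl, _ = add_hl_hl(hl)
    hl &= 0x00FF                    # ld h,a with a = 0
    hl, _ = add_hl_hl(hl)           # top bit of l moves into bit 0 of h
    return hl >> 8                  # l = h, h = 0

MODELS = {0: getbit_high, 1: getbit_high, 2: getbit_high,
          6: getbit_mask, 7: getbit_mask,
          8: getbit_low, 9: getbit_low, 10: getbit_low,
          14: getbit_mask, 15: getbit_mask}

mismatches = [(n, v) for n, model in MODELS.items()
              for v in range(0x10000) if model(v, n) != reference_bit(v, n)]
print(mismatches)  # []
```

An empty list means the modeled routines agree with the reference for every input, under the bit-numbering assumption stated above.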

EDIT: If you plan on optimizing the signed comparisons to use the parity/overflow flag, you might want to check into that a bit. I was playing around with signed comparisons and wabbitemu was telling me very strange things. It seemed to tell me that signed comparisons relied on an xor of the p/v and s flags. Which makes no sense, but that's what wabbitemu was telling me. See table below.

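| hl | de | sbc hl,de | c | p/v | s | hl>>de |
|------|------|------|---|---|---|---|
| 2000 | 6000 | C000 | 1 | 0 | 1 | 0 |
| 2000 | A000 | 8000 | 1 | 1 | 1 | 1 |
| 2000 | E000 | 4000 | 1 | 0 | 0 | 1 |
| 6000 | 2000 | 4000 | 0 | 0 | 0 | 1 |
| 6000 | A000 | C000 | 1 | 1 | 1 | 1 |
| 6000 | E000 | 8000 | 1 | 1 | 1 | 1 |
| A000 | 2000 | 8000 | 0 | 0 | 1 | 0 |
| A000 | 6000 | 4000 | 0 | 1 | 0 | 0 |
| A000 | E000 | C000 | 1 | 0 | 1 | 0 |
| E000 | 2000 | C000 | 0 | 0 | 1 | 0 |
| E000 | 6000 | 8000 | 0 | 0 | 1 | 0 |
| E000 | A000 | 4000 | 0 | 0 | 0 | 1 |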
« Last Edit: November 11, 2010, 12:20:11 am by Runer112 »

#### DJ Omnimaga

• Clacualters are teh gr33t
• CoT Emeritus
• LV15 Omnimagician (Next: --)
• Posts: 55942
• Rating: +3154/-232
• CodeWalrus founder & retired Omnimaga founder
##### Re: Assembly Programmers - Help Axe Optimize!
« Reply #84 on: November 11, 2010, 12:32:17 am »
> Yeah, I'm still reading all of this, even though I'm less active, I still visit just about every day   I've even been able to do a little more progress with Axe even with my busy schedule.

Ah phew, good to hear. Still, I hope the schedule won't get even more hectic with time. X.x
Now active at https://discord.gg/cuZcfcF (CodeWalrus server)

#### calc84maniac

• eZ80 Guru
• Coder Of Tomorrow
• LV11 Super Veteran (Next: 3000)
• Posts: 2912
• Rating: +471/-17
##### Re: Assembly Programmers - Help Axe Optimize!
« Reply #85 on: November 11, 2010, 01:04:30 am »
> It seemed to tell me that signed comparisons relied on an xor of the p/v and s flags. Which makes no sense, but that's what wabbitemu was telling me.

It actually does make a bit of sense. Whether the mathematical (non-overflowed) result of the subtraction is positive or negative should give you the result of the comparison.  However, if there was a signed overflow, it will give the wrong result. So the sign flag needs to be inverted if there was an overflow, and XOR achieves this perfectly.
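The rule can be reproduced with a short Python model of the flags after a 16-bit subtraction with carry clear. This is a sketch of the flag arithmetic only (the helper names are made up, and undocumented flags are ignored); the rows in `TABLE` are copied from the table posted earlier in the thread.

```python
def to_signed(x):
    """Interpret a 16-bit value as two's complement."""
    return x - 0x10000 if x & 0x8000 else x

def sbc16(hl, de):
    """Model result and c, p/v, s flags of `sbc hl,de` with carry clear."""
    diff = hl - de
    result = diff & 0xFFFF
    carry = int(diff < 0)
    sign = result >> 15
    # Signed overflow: operands have different signs and the result's
    # sign differs from the sign of hl.
    overflow = (((hl ^ de) & (hl ^ result)) >> 15) & 1
    return result, carry, overflow, sign

# (hl, de, result, c, p/v, s, hl>>de) as posted in the thread.
TABLE = [
    (0x2000, 0x6000, 0xC000, 1, 0, 1, 0), (0x2000, 0xA000, 0x8000, 1, 1, 1, 1),
    (0x2000, 0xE000, 0x4000, 1, 0, 0, 1), (0x6000, 0x2000, 0x4000, 0, 0, 0, 1),
    (0x6000, 0xA000, 0xC000, 1, 1, 1, 1), (0x6000, 0xE000, 0x8000, 1, 1, 1, 1),
    (0xA000, 0x2000, 0x8000, 0, 0, 1, 0), (0xA000, 0x6000, 0x4000, 0, 1, 0, 0),
    (0xA000, 0xE000, 0xC000, 1, 0, 1, 0), (0xE000, 0x2000, 0xC000, 0, 0, 1, 0),
    (0xE000, 0x6000, 0x8000, 0, 0, 1, 0), (0xE000, 0xA000, 0x4000, 0, 0, 0, 1),
]

table_matches = all(sbc16(hl, de) == (res, c, pv, s)
                    for hl, de, res, c, pv, s, _ in TABLE)

# s XOR p/v is the signed "hl < de" result; every row has hl != de,
# so "greater than" is simply its inverse there.
rule_holds = all(
    (s ^ pv) == int(to_signed(hl) < to_signed(de)) and gt == 1 - (s ^ pv)
    for hl, de, _, _, pv, s, gt in TABLE
)
print(table_matches, rule_holds)  # True True
```

So the emulator's output is consistent: the sign flag alone is the right answer unless the subtraction overflowed, and XOR with p/v corrects exactly those cases.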
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

#### Runer112

• Project Author
• LV11 Super Veteran (Next: 3000)
• Posts: 2289
• Rating: +639/-31
##### Re: Assembly Programmers - Help Axe Optimize!
« Reply #86 on: November 11, 2010, 01:43:50 am »
Yeah, my point is that Quigibo is probably just better off using the signed comparisons he already uses instead of bothering with the p/v flag, because it gets messy.
« Last Edit: November 11, 2010, 01:44:07 am by Runer112 »

#### calc84maniac

• eZ80 Guru
• Coder Of Tomorrow
• LV11 Super Veteran (Next: 3000)
• Posts: 2912
• Rating: +471/-17
##### Re: Assembly Programmers - Help Axe Optimize!
« Reply #87 on: November 15, 2010, 08:27:44 am »
Actually, the main reason he didn't use the p/v flag is because his routines didn't support absolute jumps. They apparently do now, so some speed-up using these flags might be possible.
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

#### Runer112

• Project Author
• LV11 Super Veteran (Next: 3000)
• Posts: 2289
• Rating: +639/-31
##### Re: Assembly Programmers - Help Axe Optimize!
« Reply #88 on: November 28, 2010, 06:04:40 pm »
Cool, Quigibo added all my optimized auto-optimizations. But I think you missed p_GetBit15, which can be optimized to be the same as p_Mod2.

#### Munchor

• LV13 Extreme Addict (Next: 9001)
• Posts: 6199
• Rating: +295/-121
• Code Recycler
##### Re: Assembly Programmers - Help Axe Optimize!
« Reply #89 on: November 28, 2010, 06:08:28 pm »
> Cool, Quigibo added all my optimized auto-optimizations. But I think you missed p_GetBit15, which can be optimized to be the same as p_Mod2.

Great! So, it optimizes the Axe script or the Asm conversion?

Like, the following program:

Code:
Output(0,0,"Hello World")
Is optimized to:

Code:
Output(0,0,"Hello World
Or is it Assembly that is optimized?