### Author Topic: ASM Optimized routines  (Read 61241 times)

#### Galandros

• LV9 Veteran (Next: 1337)
• Posts: 1140
• Rating: +42/-10
##### ASM Optimized routines
« on: February 28, 2010, 07:27:53 am »
There are some cool optimized routines around. calc84maniac is probably the record holder for z80 optimization, at least on the calculator z80 forums.

On to the code:
Code: [Select]
;calc84maniac
cpHLDE:
 or a
 sbc hl,de
 add hl,de
 ret
;Important note: the code is only 4 bytes and a call is 3 bytes, so just use a macro instead:
;SPASM, TASM and BRASS compatible, I guess
#define cp_HLDE  or a \ sbc hl,de \ add hl,de

;- Reverse a
;input: Byte in A
;output: Reversed byte in A
;destroys B
;Clock cycles: 66
;Bytes: 18
;author: calc84maniac
reversea:
 ld b,a
 rrca
 rrca
 xor b
 and %10101010
 xor b
 ld b,a
 rrca
 rrca
 rrca
 rrca
 xor b
 and %01100110
 xor b
 rrca
 ret

;reverse hl
;curiosity: an easy port of a common reverse-A routine is more efficient than tricky stuff
;calc84maniac
;28 bytes and 104 cycles
 ld a,l
 rla
 rr h
 rla
 rr h
 rla
 rr h
 rla
 rr h
 rla
 rr h
 rla
 rr h
 rla
 rr h
 rla
 rr h
 rla ;ninth rla shifts the last bit of the old H into A
 ld l,a
 ret

;calc84maniac
;in: a = ABCDEFGH
;out: hl= AABBCCDDEEFFGGHH
 rrca
 rra
 rra
 ld l,a
 rra
 sra l
 rla
 rr l
 sra l
 rra
 rr l
 sra l
 rrca
 rra
 rra
 ld h,a
 rra
 sra h
 rla
 rr h
 sra h
 rra
 rr h
 sra h
 ret
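For anyone who wants to check reversea on a PC before trusting it, here is a minimal Python sketch that mirrors the routine instruction by instruction. The helper names (`rrca`, `merge`, `reversea`, `reverse_reference`) are made up for this illustration; only the 8-bit contents of A and B are modeled, not the flags. The idea it tries to show is that `xor b / and mask / xor b` takes the masked bits from the rotated A and all other bits from B.

```python
def rrca(a):
    """Rotate an 8-bit value right by one; bit 0 wraps around to bit 7."""
    return ((a >> 1) | (a << 7)) & 0xFF


def merge(a, b, mask):
    """Model of `xor b / and mask / xor b`: masked bits from a, the rest from b."""
    return ((a ^ b) & mask) ^ b


def reversea(a):
    """Step-by-step model of the reversea routine (A and B only)."""
    b = a                           # ld b,a
    a = rrca(rrca(a))               # rrca / rrca
    a = merge(a, b, 0b10101010)     # xor b / and %10101010 / xor b
    b = a                           # ld b,a
    for _ in range(4):              # rrca x4
        a = rrca(a)
    a = merge(a, b, 0b01100110)     # xor b / and %01100110 / xor b
    return rrca(a)                  # rrca


def reverse_reference(a):
    """Plain bit reversal, used only to cross-check the model."""
    return int(f"{a:08b}"[::-1], 2)


mismatches = [a for a in range(256) if reversea(a) != reverse_reference(a)]
print("inputs where the model disagrees with a plain reversal:", mismatches)
```

The check simply loops over all 256 byte values, so an empty list means the model agrees with a straightforward reversal for every input.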
Code: [Select]
;Galandros optimized routines
;try to beat me... maybe it is possible...
;Displays A register content on screen as a decimal ASCII number, using no additional memory
DispA:
 ld c,-100
 call Na1
 ld c,-10
 call Na1
 ld c,-1
Na1:
 ld b,'0'-1
Na2:
 inc b
 add a,c
 jr c,Na2
 sub c ;works as add 100/10/1
 push af ;safer than ld c,a
 ld a,b ;char is in b
 CALL PUTCHAR ;plot a char. Replace with bcall(_PutC) or similar.
 pop af ;safer than ld a,c
 ret

;Note: the following one is optimized for RPG menus and the like, and it is quite flexible. I am going to use it in Lost Legends I ^^
;I started with one which used additional RAM for temporary storage (made by me, too), and optimized it for size, speed and no extra memory use! ^.^
;the inc's and dec's were tricky to debug -.-", the registers b and c work as counters and flags
;DispHL for games
;input: hl=num, d=col, e=row (ld (CurRow),de stores e in CurRow and d in CurCol), c=number of digits to skip
;number of characters to display: 5 ; example: 65000
;output: hl displayed, with the skipped digits left out and spaces in place of leading zeros
DispHL_games:
 inc c
 ld b,1 ;skip 0 flag
 ld (CurRow),de
;Number in hl to decimal ASCII
;Thanks to z80 Bits
;inputs: hl = number to ASCII
;example: hl=300 outputs '  300'
;destroys: af, hl, de used
 ld de,-10000
 call Num1
 ld de,-1000
 call Num1
 ld de,-100
 call Num1
 ld e,-10
 call Num1
 ld e,-1
Num1:
 ld a,'0'-1
Num2:
 inc a
 add hl,de
 jr c,Num2
 sbc hl,de
 dec c ;c is skipping
 jr nz,skipnum
 inc c
 djnz notcharnumzero
 cp '0'
 jr nz,notcharnumzero
leadingzero:
 inc b
skipnum:
 ld a,' '
notcharnumzero:
 push bc
 call PUTCHAR  ;bcall(_PutC) works, not sure if it preserves bc
 pop bc
 ret
PUTCHAR:
 bcall(_PutC)
 ret

;Example usage of DispHL_games to understand what I mean
Test2:
 ld hl,60003
 ld de,$0101
 ld c,0
 call DispHL_games
 ld hl,60003
 ld de,$0102
 ld c,1
 call DispHL_games
 ret
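The digit loop in DispA (add a negative constant until the carry disappears, then undo the last step with `sub c`) can be modeled on a PC. The following is only a sketch under that reading of the routine: `disp_a_digits` is a hypothetical helper name, the carry flag is modeled as "the 8-bit addition overflowed", and the call to PUTCHAR is replaced by collecting the characters in a list.

```python
def disp_a_digits(a):
    """Return the three characters DispA would send to PUTCHAR for byte a."""
    out = []
    for c in (-100, -10, -1):
        c &= 0xFF                  # ld c,-100 / -10 / -1 as an unsigned byte
        b = ord('0') - 1           # ld b,'0'-1
        while True:
            b += 1                 # inc b
            total = a + c          # add a,c
            a = total & 0xFF
            if total <= 0xFF:      # carry clear: we subtracted once too often
                break              # jr c,Na2 falls through
        a = (a - c) & 0xFF         # sub c, which adds 100/10/1 back
        out.append(chr(b))         # ld a,b / call PUTCHAR
    return "".join(out)


for value in (0, 7, 42, 199, 255):
    print(value, "->", repr(disp_a_digits(value)))
```

In this model the output always has three characters, leading zeros included, which matches the routine having no leading-zero suppression.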
Well, don't try to understand or optimize calc84maniac's ones. j/k, trying to understand them can be hard (tip: keep a good instruction set summary at hand), but it teaches some inner details of z80 asm.
About mine, do your best.
Hobbing in calculator projects.

#### Quigibo

• The Executioner
• CoT Emeritus
• LV11 Super Veteran (Next: 3000)
• Posts: 2031
• Rating: +1075/-24
• I wish real life had a "Save" and "Load" button...
##### Re: ASM Optimized routines
« Reply #1 on: February 28, 2010, 05:21:57 pm »
Here is a little optimization I use but haven't really seen around.  When you need a direct key press, you have to wait about 7 clock cycles between setting the port and reading it.  Most people just fill in the extra space with a waste instruction like this:

Code: [Select]
ld a,xx
out (1),a
ld a,(de)
in a,(1)
and yy
9 Bytes, 43 T-States.

You can actually use the waste instruction to do something useful.  It gives a slight speed increase.

Code: [Select]
ld a,xx
out (1),a
ld b,yy
in a,(1)
and b
9 Bytes, 40 T-States.
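The two totals quoted above can be re-derived by adding up the per-instruction costs. This is a small sketch that assumes the usual documented Z80 sizes and T-state counts for these instruction forms; the `COST` table and the `tally` helper are made up for the illustration.

```python
# (size in bytes, T-states) for each instruction form used in the two snippets
COST = {
    "ld a,n":    (2, 7),
    "out (n),a": (2, 11),
    "ld a,(de)": (1, 7),
    "in a,(n)":  (2, 11),
    "and n":     (2, 7),
    "ld b,n":    (2, 7),
    "and b":     (1, 4),
}


def tally(instructions):
    """Sum the size and the T-states of a list of instruction forms."""
    size = sum(COST[ins][0] for ins in instructions)
    tstates = sum(COST[ins][1] for ins in instructions)
    return size, tstates


with_filler = ["ld a,n", "out (n),a", "ld a,(de)", "in a,(n)", "and n"]
with_useful_load = ["ld a,n", "out (n),a", "ld b,n", "in a,(n)", "and b"]

print("filler version:      ", tally(with_filler))
print("useful-load version: ", tally(with_useful_load))
```

The saving comes from `and b` (4 T-states, 1 byte) replacing `and yy` (7 T-states, 2 bytes), while `ld b,yy` costs the same 7 T-states as the filler `ld a,(de)` but one byte more, so the size stays at 9 bytes.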
« Last Edit: February 28, 2010, 05:23:48 pm by Quigibo »
___Axe_Parser___
Today the calculator, tomorrow the world!

#### calc84maniac

• eZ80 Guru
• Coder Of Tomorrow
• LV11 Super Veteran (Next: 3000)
• Posts: 2898
• Rating: +467/-17
##### Re: ASM Optimized routines
« Reply #2 on: February 28, 2010, 08:12:27 pm »
Small and quick setup for IM 2 (this example sets up vector table at $9900 and interrupt jump at $9a9a, but values can be changed)
Code: [Select]
di
ld a,$99
ld bc,$0100
ld h,a
ld d,a
ld l,c
ld e,b
ld i,a
inc a
ld (hl),a
ldir
ld l,a
ld (hl),$c3
inc l
ld (hl),intvec & $ff
inc l
ld (hl),intvec >> 8
im 2
ei
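To see why the table is filled with $9a and why the jump sits at $9a9a, here is a rough Python model of the setup above. It assumes a flat 64 KB `bytearray` for memory, and `setup_im2`, `interrupt_target` and the address `0xA000` standing in for `intvec` are all hypothetical names and values chosen for this sketch. In IM 2 the CPU reads a 16-bit address from (I * 256 + bus byte), so the table needs 257 bytes to cover a bus byte of $ff.

```python
def setup_im2(mem, intvec):
    """Model of the setup code: fill $9900-$9A00 with $9A, put JP intvec at $9A9A."""
    a = 0x99
    hl, de, bc = 0x9900, 0x9901, 0x0100   # ld h,a / ld l,c / ld d,a / ld e,b
    i_reg = a                             # ld i,a
    a = (a + 1) & 0xFF                    # inc a -> $9A
    mem[hl] = a                           # ld (hl),a
    while bc:                             # ldir: copy 256 bytes forward
        mem[de] = mem[hl]
        hl, de, bc = hl + 1, de + 1, bc - 1
    hl = (hl & 0xFF00) | a                # ld l,a -> hl = $9A9A
    mem[hl] = 0xC3                        # opcode of JP nn
    mem[hl + 1] = intvec & 0xFF           # inc l / low byte of the handler
    mem[hl + 2] = intvec >> 8             # inc l / high byte of the handler
    return i_reg


def interrupt_target(mem, i_reg, bus_byte):
    """Address the CPU jumps to in IM 2 for a given byte on the data bus."""
    addr = (i_reg << 8) | bus_byte
    return mem[addr] | (mem[(addr + 1) & 0xFFFF] << 8)


mem = bytearray(0x10000)
i_reg = setup_im2(mem, 0xA000)            # 0xA000 is a made-up handler address
targets = {interrupt_target(mem, i_reg, v) for v in range(256)}
print([hex(t) for t in targets])
```

Because every byte of the table is $9a, any pair of adjacent bytes reads as $9a9a, whatever value happens to be on the bus.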
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

#### Galandros

• LV9 Veteran (Next: 1337)
• Posts: 1140
• Rating: +42/-10
##### Re: ASM Optimized routines
« Reply #3 on: April 24, 2010, 12:12:44 pm »
I found this optimized routine around. It is about as optimized as a z80 string copy can get.
Code: [Select]
;author: calc84maniac, I think
;Copy zero terminated string at HL to DE.
StrCopy:
 xor a
docopystr:
 cp (hl)
 ldi
 jr nz,docopystr
 ret
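The loop works because `ldi` leaves the Z flag alone, so `jr nz` still sees the result of `cp (hl)` from before the copy. A minimal Python sketch of that behaviour follows, assuming a flat `bytearray` as memory; `str_copy` is a hypothetical helper name, and the BC decrement that `ldi` also performs is not modeled.

```python
def str_copy(mem, hl, de):
    """Copy a zero-terminated string from hl to de, terminator included."""
    while True:
        byte = mem[hl]        # cp (hl) with a = 0: Z is set when the byte is 0
        mem[de] = byte        # ldi copies the byte either way
        hl += 1
        de += 1
        if byte == 0:         # jr nz,docopystr falls through on the terminator
            break
    return hl, de             # both point just past the terminator


mem = bytearray(0x400)
mem[0x100:0x103] = b"Hi\x00"
end_hl, end_de = str_copy(mem, 0x100, 0x200)
print(bytes(mem[0x200:0x203]), hex(end_hl), hex(end_de))
```

Note that the terminator itself is copied, and that HL and DE end up one past it, which is handy for appending another string right after.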
These are quite optimized, but it may be possible to optimize them further (for speed and size). It is not really needed, though...
They shift a graphics buffer (optimized for 96x64) up or down by the number of pixels passed in the A register.
Code: [Select]
scroll_up:
#ifdef DEBUG
 cp 64+1
 call nc,ErrorOverFlow ; error if a >= 65
#endif
 add a,a
 add a,a
 ld l,a
 ld e,a
 ld h,0
 ld d,h
 add hl,hl
 add hl,de ; hl=a*12
 push hl
 ld de,768
 ex de,hl
; carry is never set here if input is correct
; or a
 sbc hl,de
 ld b,h
 ld c,l ; bc=768-12*a
 ex de,hl
 ld de,plotsscreen
 add hl,de
 ldir
;blank remaining area
 ld h,d
 ld l,e
 inc de
 ld (hl),$00
 pop bc
 dec bc ; bc=12*a-1
 ldir
 ret
;PSEUDO CODE
; ld hl,plotsscreen+12*a
; ld de,plotsscreen
; ld bc,768-12*a
; ldir
; ld h,d
; ld l,e
; ld (hl),$00
; inc de
; ld bc,12*a
; dec bc
; ldir
; ret

scroll_down:
#ifdef DEBUG
 cp 64+1
 call nc,ErrorOverFlow ; error if a >= 65
#endif
; a can be from 1 to 63
; a can be multiplied by 4
 add a,a
 add a,a ; a*4
 ld l,a ; hl = a*4
 ld e,a
 xor a
 ld h,a
 ld d,a
 add hl,hl ; hl = a*8
 add hl,de ; hl = a*12
 ld e,a ; de = 0
 push hl ; a*12 will be needed later
 push hl ; 2 times
 ex de,hl
;carry is never set here
; or a
 sbc hl,de ; hl= -a*12, de=a*12
 ld de,plotsscreen+767
 add hl,de ; hl=plotsscreen+767-12*a
 pop bc
 push hl
 ld hl,768+1
;carry always set
; or a
 sbc hl,bc
 ld b,h
 ld c,l
 pop hl
 lddr
;blank remaining area
 ld h,d
 ld l,e
 ld (hl),$00
 dec de
 pop bc
 dec bc
 lddr
 ret
; ld hl,plotsscreen+767-12*a
; ld de,plotsscreen+767
; ld bc,768-12*a
; lddr
;                 or
; ld (hl),$00     ;; ld hl,plotsscreen
; ld h,d          ;; ld (hl),$00
; ld l,e          ;; ld de,hl+1
; dec de          ;; ld bc,12*a-1
; ld bc,12*a-1    ;; ldir
; lddr            ;; ret
; ret
« Last Edit: April 24, 2010, 12:15:14 pm by Galandros »
Hobbing in calculator projects.

#### mapar007

• LV7 Elite (Next: 700)
• Posts: 550
• Rating: +28/-5
• The Great Mata Mata
##### Re: ASM Optimized routines
« Reply #4 on: April 25, 2010, 03:58:56 am »
Very nice! I'll add these to my utils.z80 file that is included in all my app builds.
Anyone wanting to compile a stdlib.c and revive the tisdcc project? j/k

#### Galandros

• LV9 Veteran (Next: 1337)
• Posts: 1140
• Rating: +42/-10
##### Re: ASM Optimized routines
« Reply #5 on: April 25, 2010, 05:04:47 am »
Very nice! I'll add these to my utils.z80 file that is included in all my app builds.
Anyone wanting to compile a stdlib.c and revive the tisdcc project? j/k

Actually I am working on something like that. I am hand-writing C functions in z80 assembly just for fun. I will share them when I finish.
After seeing Axe Parser, it seems that it is possible to make a good C compiler for z80. And we have documentation on how to optimize z80 assembly that could be used to write an optimizer; check the WikiTI topic: http://wikiti.brandonw.net/index.php?title=Z80_Optimization.
« Last Edit: April 25, 2010, 05:14:53 am by Galandros »
Hobbing in calculator projects.

#### DJ Omnimaga

• Former TI programmer
• CoT Emeritus
• LV15 Omnimagician (Next: --)
• Posts: 55882
• Rating: +3151/-232
• CodeWalrus founder & retired Omnimaga founder
##### Re: ASM Optimized routines
« Reply #6 on: April 25, 2010, 12:19:53 pm »
Very nice! I'll add these to my utils.z80 file that is included in all my app builds.
Anyone wanting to compile a stdlib.c and revive the tisdcc project? j/k

I think I remember this, it was Halifax from the old Omnimaga forums who worked on it, right? There was a thread about it somewhere

#### Quigibo

• The Executioner
• CoT Emeritus
• LV11 Super Veteran (Next: 3000)
• Posts: 2031
• Rating: +1075/-24
• I wish real life had a "Save" and "Load" button...
##### Re: ASM Optimized routines
« Reply #7 on: April 29, 2010, 05:59:58 pm »
Quigibo's Challenge!

Can any of the following be done in 6 or fewer bytes?  The input and output must be HL.

• Multiply by 128?
• Signed division by any nontrivial constant, other than 2, including negative numbers?
• Modulus with any constant that is not a power of 2?

I'm rewriting my math engine almost from scratch so I decided I would just optimize everything I could possibly conceive of at the same time.  These are the ones I'm having trouble finding.
___Axe_Parser___
Today the calculator, tomorrow the world!

#### calc84maniac

• eZ80 Guru
• Coder Of Tomorrow
• LV11 Super Veteran (Next: 3000)
• Posts: 2898
• Rating: +467/-17
##### Re: ASM Optimized routines
« Reply #8 on: April 29, 2010, 06:31:16 pm »
Seems pretty impossible to me.
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

#### Quigibo

• The Executioner
• CoT Emeritus
• LV11 Super Veteran (Next: 3000)
• Posts: 2031
• Rating: +1075/-24
• I wish real life had a "Save" and "Load" button...
##### Re: ASM Optimized routines
« Reply #9 on: April 29, 2010, 06:58:39 pm »
Okay, that's good.  I spent hours trying to optimize some of these using all the tricks I know.  That reassures me it was a wild goose chase.
___Axe_Parser___
Today the calculator, tomorrow the world!

#### DJ Omnimaga

• Former TI programmer
• CoT Emeritus
• LV15 Omnimagician (Next: --)
• Posts: 55882
• Rating: +3151/-232
• CodeWalrus founder & retired Omnimaga founder
##### Re: ASM Optimized routines
« Reply #10 on: April 29, 2010, 07:01:08 pm »
Seems pretty impossible to me.

No way! You're calc84god, you can do everything, even the impossible! (see TI-Boy SE/Project M/F-Zero) j/k

I can't wait to see what kind of optimizations there will be in the next versions of Axe

#### Quigibo

• The Executioner
• CoT Emeritus
• LV11 Super Veteran (Next: 3000)
• Posts: 2031
• Rating: +1075/-24
• I wish real life had a "Save" and "Load" button...
##### Re: ASM Optimized routines
« Reply #11 on: April 29, 2010, 07:34:45 pm »
It's nothing big.  Mostly it just extends multiplication, modulus, and addition to higher powers of 2.  The big optimizations won't come for a long time unfortunately.  Functionality is more important right now.

By the way, is there a better way to display hl at the coordinates (xx,yy) than this?

Code: [Select]
B_CALL(_SetXXXXOP2)
B_CALL(_Op2ToOP1)
ld hl,$yyxx
ld (PenCol),hl
ld a,5
B_CALL(_DispOP1A)
It seems really roundabout to me.  Is there a bcall I don't know about that does this automatically?
___Axe_Parser___
Today the calculator, tomorrow the world!

#### calcdude84se

• Needs Motivation
• LV11 Super Veteran (Next: 3000)
• Posts: 2272
• Rating: +78/-13
• Wondering where their free time went...
##### Re: ASM Optimized routines
« Reply #12 on: April 29, 2010, 07:57:10 pm »
yeah, there's _DispHL
so your code would be:
Code: [Select]
push hl
ld hl,$yyxx
ld (PenCol),hl
pop hl
B_CALL(_DispHL)
Just be aware it's right-justified in 5 spaces. (Since $ffff is 5 decimal digits, 65535)
EDIT: oh, wait, that's pencol? so this code doesn't work then. Oops...
« Last Edit: April 30, 2010, 05:49:37 pm by calcdude84se »
"People think computers will keep them from making mistakes. They're wrong. With computers you make mistakes faster."
-Adam Osborne
Spoiler For "PartesOS links":
I'll put it online when it does something.

#### calc84maniac

• eZ80 Guru
• Coder Of Tomorrow
• LV11 Super Veteran (Next: 3000)
• Posts: 2898
• Rating: +467/-17
##### Re: ASM Optimized routines
« Reply #13 on: April 29, 2010, 10:27:56 pm »
He's talking about graph screen display.
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

#### Galandros

• LV9 Veteran (Next: 1337)
• Posts: 1140
• Rating: +42/-10
##### Re: ASM Optimized routines
« Reply #14 on: April 30, 2010, 09:21:30 am »
Quigibo's Challenge!

Can any of the following be done in 6 or fewer bytes?  The input and output must be HL.

• Multiply by 128?
• Signed division by any nontrivial constant, other than 2, including negative numbers?
• Modulus with any constant that is not a power of 2?
Challenge accepted.

Answer to the multiplication by 128 in 6 bytes:

I started by coding a routine that multiplies A by 128:
Spoiler For Spoiler:
; The old trick to multiply by 256, by moving the low byte to high byte
ld h,a
xor a   ; resets carry
rr h     ; divide h by 2
rra      ; and pass bit 0 to a
ld l,a   ; store to l
; hl is a*128

After that, I very easily modified it to compute (hl*128)%(2^16). Unsigned version:
Spoiler For Spoiler:
ld h,l
xor a
rr h
rra
ld l,a
; 6 bytes and 24 clocks to multiply hl by 128, not bad O_o

I am very sure these routines work, but I have not tested them.
EDIT4: tested with a few values, it works.

EDIT3:
Multiply hl by 128, now signed. If I am right, to make it signed you only need to preserve bit 7? If that's so:
Spoiler For Spoiler:
ld h,l
xor a
sra h
rra
ld l,a
; 6 bytes, 24 clocks, too
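
Since only a few values were tried on the calculator, an exhaustive check on a PC may be useful. The sketch below models the two HL routines above register by register and compares them with `(hl * 128) mod 65536`; all function names are made up for this illustration, and the model rests on the assumption that `ld h,l` simply overwrites the old H. Under that assumption the unsigned version can only agree with the reference when bit 8 of the input (bit 0 of H) is clear, and the `sra h` version when H is the sign extension of L, that is, for values from -128 to 127.

```python
def mul128_unsigned_routine(hl):
    """Model of: ld h,l / xor a / rr h / rra / ld l,a"""
    h = hl & 0xFF                 # ld h,l (the old H is overwritten)
    carry = h & 1                 # rr h with carry cleared by xor a
    h >>= 1
    a = carry << 7                # rra moves the carry into bit 7 of A
    return (h << 8) | a           # ld l,a


def mul128_signed_routine(hl):
    """Model of: ld h,l / xor a / sra h / rra / ld l,a"""
    h = hl & 0xFF                 # ld h,l
    carry = h & 1
    h = (h & 0x80) | (h >> 1)     # sra h keeps bit 7
    a = carry << 7                # rra
    return (h << 8) | a           # ld l,a


def reference(hl):
    """hl * 128, truncated to 16 bits."""
    return (hl * 128) & 0xFFFF


unsigned_ok = [hl for hl in range(0x10000)
               if mul128_unsigned_routine(hl) == reference(hl)]
signed_ok = [hl for hl in range(0x10000)
             if mul128_signed_routine(hl) == reference(hl)]
print(len(unsigned_ok), "inputs match for the unsigned routine")
print(len(signed_ok), "inputs match for the signed routine")
```

If the full 16-bit range matters, the routine would need to carry bit 0 of the old H into bit 15 of the result, which this model suggests the 6-byte versions do not do.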

Now I will think about the others when I have more free time. Fun, fun, fun.
Give me some time, please.
EDIT: I am thinking of putting some of these challenges on WikiTI when we finish the challenge. And maybe Axe's routines. If you have other optimization routines/challenges, share them to see what I can do.
EDIT2: fixed a bug/typo and commented even more the code
« Last Edit: April 30, 2010, 01:18:05 pm by Galandros »
Hobbing in calculator projects.