Omnimaga

Calculator Community => TI Calculators => ASM => Topic started by: ralphdspam on May 04, 2011, 01:42:35 am

Title: 4*4 Sprite Routine Optimization/Help
Post by: ralphdspam on May 04, 2011, 01:42:35 am

Hi, I just made this 4*4 sprite routine, aligned to a 24*16 grid (each cell is half a byte wide).
Can anyone help with optimizations or suggestions?

http://typewith.me/BSgvbUFXDh

Code:

;==========
;XOR 4*4
;==========
;Inputs:
;ix = sprite address
;c = X Position
;b = Y Position
;Outputs:
;a = last line of sprite displayed
;b = 0
;c = X position (undestroyed ^^)
;de = 12
;hl = buffer location of sprite + 48 (the row just below the sprite)
;ix = sprite address + 2 (next sprite?)
;==========
;This was made to be a standalone program.
#ifdef bcall(xxxx)
;THE GAME
#else
#include "ti83plus.inc"
.org $9d93
.db t2ByteTok, tAsmCmp
#endif
;==========
;Just for testing purposes:
        bcall(_clrlcdfull)
        bcall(_runindicoff)
        bcall(_grbufclr)
        
        ld ix, X4Sprite ;ix holds sprite address
        ld c, 1 ;c holds X location
        ld b, 1 ;b holds Y location
;==========
        
FourXOR:
        
        ld a, b        
        add a, a ;\A * 3
        add a, b ;/
        add a, a ;\A * 4
        add a, a ;/
        
        ld h, 0 ;A*2 is too large to fit into 8 bits, so we must use a 16 bit register.
        ld l, a        
        add hl, hl ;\A * 4
        add hl, hl ;/Align Y pos to the X4 grid.
        
        ld d, 0
        ld e, c        
        srl e ;Now we are only concerned with the X8 (byte) alignment.  We will fine-tune it to X4 later.
        add hl, de ;add X pos to Y pos

        ld de, plotsscreen ;Finally, we add the Buffer location.
        add hl, de
        
        
        ld b, 2 ;B*2 Pixels tall sprite        
        ld e, 12 ;Width of Screen
        
_FourDispByte:
        ld a, (ix) ;loads sprite at ix
        and %11110000 ;use left side of byte
        
        bit 0, c ;If X pos is an odd number, we have to shift to the right.
        jr z, _FourDispN1 ;if X is aligned to the byte (even), branch.  Else, shift sprite right by 4.
        
        srl a ;shift sprite to the right 4 pixels.
        srl a
        srl a
        srl a
        
_FourDispN1:
        ld d, (hl)
        xor d ;xor sprite onto current buffer contents
        ld (hl), a
        
        ld d, 0 ;DE = 12
        add hl, de ;shift pen down one row
        
        ld a, (ix)
        and %00001111 ;It is on the right side of the byte.
        
        bit 0, c
        jr nz, _FourDispN2 ;if X is not aligned to the byte (odd), branch.  Else, shift sprite left by 4.
        
        add a, a ;shift sprite to the left 4 pixels
        add a, a
        add a, a
        add a, a
        
_FourDispN2:
        ld d, (hl)
        xor d ;xor sprite onto current buffer contents
        ld (hl), a
        
        inc ix
        ld d, 0
        add hl, de
        djnz _FourDispByte
        
;==========
;More pseudo random stuff just for testing
        bcall(_grbufcpy)
        bcall(_getkey)
;==========
        ret
        
X4Sprite:
.db %10010110
.db %01101001
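
For anyone who wants to check the address math off-calculator, here is a rough Python sketch of what the setup code above computes. It is only a model, not calculator code: the names `four_xor_offset` and `ROW_BYTES` are made up for illustration, and it assumes the usual 96x64 buffer with 12 bytes per row.

```python
# Hypothetical model of the FourXOR address math (illustrative names only).
ROW_BYTES = 12  # assumed: one 96-pixel buffer row is 12 bytes


def four_xor_offset(x, y):
    """Return (byte offset into the buffer, True if the right nibble is used)."""
    a = (y + y) & 0xFF              # add a,a         -> 2*y
    a = (a + y) & 0xFF              # add a,b         -> 3*y
    a = (a + a) & 0xFF              # add a,a         -> 6*y
    a = (a + a) & 0xFF              # add a,a         -> 12*y (max 180, fits in 8 bits)
    hl = a                          # ld h,0 / ld l,a
    hl = (hl + hl) & 0xFFFF         # add hl,hl       -> 24*y
    hl = (hl + hl) & 0xFFFF         # add hl,hl       -> 48*y
    hl = (hl + (x >> 1)) & 0xFFFF   # srl e / add hl,de
    return hl, bool(x & 1)


# Self-check over the whole 24*16 grid.
for y in range(16):
    assert 12 * y < 256             # the 8-bit stage never overflows
    for x in range(24):
        offset, right_nibble = four_xor_offset(x, y)
        assert offset == 4 * ROW_BYTES * y + x // 2
        assert right_nibble == (x % 2 == 1)
```

The 8-bit stage stops at 12*Y because 12*15 = 180 still fits in A, while doubling it again (24*15 = 360) would not, which is why the last two doublings happen in HL.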

Also, Zeda's routines can be found here:
http://typewith.me/V0lEvy11lc

Runer112's routine:
http://typewith.me/Zv15SUf9Ve

Thanks. :)

Title: Re: 4*4 Sprite Routine Optimization/Help
Post by: ralphdspam on May 04, 2011, 10:46:34 pm

Hmm... It's been 6 hours, right? (This can also probably count as an update too.)

Here is the current routine:
EDIT: See above. ;)

Title: Re: 4*4 Sprite Routine Optimization/Help
Post by: Xeda112358 on May 05, 2011, 08:44:03 am

Okay, this is some code I put together... I was going to use the code from BatLib for displaying 4x6 sprites (for fonts), because it used nibble data (so that each 4x6 sprite used 3 bytes), but I messed up when I was trying to optimise it XD Anywho, here is this... I don't even know if it is more or less optimised than your routine; I just wanted to make one, too XD

Code:

;=============================
DrawSprite4x4:
;=============================
;Inputs:
;     C is the y coordinate (0 to 15)
;     B is the column to draw to (0 to 23)
;     DE points to the font data
;Outputs:
;     A is 1
;     BC is 12
;     DE is incremented by 4 (pointing to next sprite?)
;     HL is incremented by 30h
;=============================
    ld a,b           ;78
    ld b,0           ;0600
    ld h,b           ;60
    ld l,c           ;69
    add hl,hl        ;29
    add hl,bc        ;09
    add hl,hl        ;29
    add hl,hl        ;29
    add hl,hl        ;29
    add hl,hl        ;29
    rra              ;1F
    ld c,a           ;4F
    push af          ;F5
    add hl,bc        ;09
    ld bc,plotSScreen  ;014093
    add hl,bc        ;09
    ld b,4           ;0604
    pop af           ;F1
    ld a,$F0         ;3EF0
    jr c,RightMask   ;3801
      cpl            ;2F
RightMask:
    ld (flags+asm_Flag1),a  ;32118A
DrawTheSprite:
    ld a,(flags+asm_Flag1)  ;3A118A
    ld c,(hl)        ;4E
    and c            ;A1
    ld (hl),a        ;77
    ld a,(de)        ;1A
    and $F0          ;E6F0
    bit 0,(iy+asm_Flag1)  ;FDCB2146
    jr nz,NoShift    ;2004
      rlca           ;07
      rlca           ;07
      rlca           ;07
      rlca           ;07
NoShift:
    or (hl)          ;B6
    ld (hl),a        ;77
    ld a,b           ;78
    ld bc,12         ;010C00
    add hl,bc        ;09
    ld b,a           ;47
    inc de           ;13
    djnz DrawTheSprite  ;10E2
    ret              ;C9

Title: Re: 4*4 Sprite Routine Optimization/Help
Post by: ralphdspam on May 05, 2011, 08:24:23 pm

Thanks! I will try this later. :D

EDIT: I found out that AND can take imm₈ values! Saved myself a couple of bytes. :)

Title: Re: 4*4 Sprite Routine Optimization/Help
Post by: Xeda112358 on May 05, 2011, 08:50:31 pm

Nice :) There are a lot of fun tricks like that >.> As a note about the "cpl" instruction, that inverts the bits in the "a" register. So if it was 01100100, cpl would change a to 10011011 :)
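
As a rough illustration, here is a tiny Python model (not Z80 code) of that inversion and of how the DrawSprite4x4 routine above uses it to pick its mask. The helper names `cpl` and `keep_mask` are hypothetical, chosen just for this sketch.

```python
def cpl(a):
    """Model of Z80 CPL: invert all 8 bits of the accumulator."""
    return ~a & 0xFF


def keep_mask(odd_column):
    """Mask ANDed with the buffer byte; it keeps the nibble the sprite does not touch."""
    a = 0xF0                              # ld a,$F0
    return a if odd_column else cpl(a)    # jr c,RightMask / cpl


print(format(cpl(0b01100100), "08b"))     # the example from the post
print(hex(keep_mask(True)), hex(keep_mask(False)))
```

Since bit 0 of the mask differs between the two cases, the same byte can double as the "do I need to shift?" flag, which is what the `bit 0` test in the loop relies on.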

Title: Re: 4*4 Sprite Routine Optimization/Help
Post by: ralphdspam on May 05, 2011, 09:07:07 pm

Zeda, I moved your routine to its own dedicated TypeWith.me page. ;)
http://typewith.me/V0lEvy11lc

I also updated my first post with your link and the updated code.

Title: Re: 4*4 Sprite Routine Optimization/Help
Post by: Runer112 on May 06, 2011, 01:17:34 am

Sorry it took me so long to get together a routine, but I was busy with some other stuff. My routine is in the style of Xeda's routine. It reads from a 4-byte sprite and uses overwrite logic, and for the most part mimics her routine. Except it's improved. ;D

7 bytes smaller
~160 cycles faster
Doesn't use any RAM

It also allows for drawing to a buffer besides plotSScreen, although I didn't mention this because hers could easily be modified for this as well. You can also view the routine at http://typewith.me/Zv15SUf9Ve. Xeda or anyone else, feel free to grab this routine for your own projects. :)

Code:

PutSprite4x4:
;———————————————————54 bytes———————————————————;
;ENTRY POINT #1: PutSprite4x4
;—> Draws a 4x4 sprite to plotSScreen, aligned to a 24x16 grid
;INPUTS:    a=row (0-15)    c=column (0-23)    de=sprite (4 bytes, data in high nibble, low nibble must be 0)
;OUTPUTS:   a=0    bc=12    de=sprite+4    hl=((row+1)*48)+(column/2)+plotSScreen
;FLAGS:     S=0  Z=1  H=0  V=0  N=1  C=column mod 2
;—————————————————~620 cycles——————————————————;
;ENTRY POINT #2: PutSprite4x4_AnyBuf
;—> Draws a 4x4 sprite to the specified buffer, aligned to a 24x16 grid
;INPUTS:    a=row (0-15)    c=column (0-23)    de=sprite (4 bytes, data in high nibble, low nibble must be 0)    hl=buffer
;OUTPUTS:   a=0    bc=12    de=sprite+4    hl=((row+1)*48)+(column/2)+buffer
;FLAGS:     S=0  Z=1  H=0  V=0  N=1  C=column mod 2
;—————————————————~610 cycles——————————————————;
    ld hl,plotSScreen
PutSprite4x4_AnyBuf:
    push hl
    ld b,a
    add a,a
    add a,b
    add a,a
    add a,a
    ld h,0
    ld l,a
    ld b,h
    add hl,hl
    add hl,hl
    rr c
    rla
    add hl,bc
    pop bc
    add hl,bc
    ld bc,12
    rra
    ld a,4
__PutSprite4x4_Loop:
    push af
    ld a,(hl)
    jr c,__PutSprite4x4_Loop_AlignLeft
    and %00001111
    ld (hl),a
    ld a,(de)
    jr __PutSprite4x4_Loop_AlignEnd
__PutSprite4x4_Loop_AlignLeft:
    and %11110000
    ld (hl),a
    ld a,(de)
    rra
    rra
    rra
    rra
__PutSprite4x4_Loop_AlignEnd:
    or (hl)
    ld (hl),a
    add hl,bc
    inc de
    pop af
    dec a
    jr nz,__PutSprite4x4_Loop
    ret

EDIT: Found a way to save 12 more cycles.
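
To show the difference between overwrite logic and the XOR logic of the first routine, here is a small Python sketch of what happens to one buffer byte. It is a model under assumptions, not the routine itself: `put_overwrite` and `put_xor` are hypothetical names, and each sprite row is assumed to be stored in the high nibble of its byte with the low nibble zero.

```python
def put_overwrite(buf_byte, sprite_row, odd_column):
    """Overwrite: clear the target nibble, then OR the sprite nibble in."""
    if odd_column:
        return (buf_byte & 0xF0) | (sprite_row >> 4)   # sprite lands in the right nibble
    return (buf_byte & 0x0F) | sprite_row              # sprite lands in the left nibble


def put_xor(buf_byte, sprite_row, odd_column):
    """XOR: toggle the buffer bits wherever the sprite nibble has a set bit."""
    nibble = (sprite_row >> 4) if odd_column else sprite_row
    return buf_byte ^ nibble


background = 0xFF
row = 0x90  # %1001 in the high nibble, low nibble zero (assumed data layout)

print(hex(put_overwrite(background, row, False)))  # left nibble replaced
print(hex(put_xor(background, row, False)))        # left nibble toggled
# Drawing twice with XOR restores the background; overwrite does not.
assert put_xor(put_xor(background, row, True), row, True) == background
```

Overwrite is the better fit for tilemaps and text, where whatever was underneath should disappear; XOR is handy when the sprite has to be erased again by simply redrawing it.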

Title: Re: 4*4 Sprite Routine Optimization/Help
Post by: ralphdspam on May 06, 2011, 01:28:31 am

Ah, looks well optimized! :)

Quote from: Runer112 on May 06, 2011, 01:17:34 am

Xeda or anyone else, feel free to grab this routine for your own projects. :)

Thanks, I will. :)

Title: Re: 4*4 Sprite Routine Optimization/Help
Post by: Xeda112358 on May 06, 2011, 11:06:57 am

Nice :D I am still working on optimising my routines for BatLib, but I have the one that uses 3 bytes for 4x6 sprites. If I find the time to convert it, I will post it, but I don't think I will have the time :/