Omnimaga
Topic started by: Hot_Dog on August 19, 2011, 11:37:35 pm
-
Although we give index registers a bad name, they can be VERY useful at times, especially if you know what you're doing. Feel free to post here about a time you used IX and IY for more than just "an extra HL."
For me, I'm working on something which I will announce later, but it involves grayscale. It started out as monochrome, and IX pointed to plotscreen while HL pointed to necessary sprites and textures. Then I finally realised that I could do grayscale by drawing to two different buffers (a *duh* moment, but after all I'm new to grayscale), so I used IY to redraw exactly what was drawn through IX, but on a second buffer.
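A minimal sketch of that two-buffer idea, not the actual project code: buffer1, buffer2, the 8-row sprite, and the 12-byte row stride are assumptions made up for illustration. Because TI-OS expects IY to point at its flags, the sketch keeps interrupts off while IY is borrowed and restores it afterwards.

```z80
; Sketch only: buffer1, buffer2, and the 8-row sprite are hypothetical.
; In:  HL -> 8 sprite bytes, one per row
        di                      ; keep the TI-OS interrupt away while IY is borrowed
        ld      ix,buffer1      ; first buffer
        ld      iy,buffer2      ; second buffer
        ld      de,12           ; assumed row stride: 12 bytes per screen row
        ld      b,8             ; rows to draw
draw_row:
        ld      a,(hl)          ; fetch one sprite byte
        ld      (ix+0),a        ; draw it to buffer 1
        ld      (iy+0),a        ; mirror the same write to buffer 2
        inc     hl
        add     ix,de           ; move both pointers down one row
        add     iy,de
        djnz    draw_row
        ld      iy,flags        ; hand IY back to TI-OS
        ei
```

Each (ix+d) or (iy+d) store costs 19 t-states against 7 for ld (hl),a, so the two extra pointers are paid for in speed.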
-
Not entirely sure this counts as "just an extra HL" but I do recall using IX in a 32-bit scoreboard. Since IX can't be used in an ADC instruction, it held the low word (ADD IX,BC) while HL took the carry into the high word (ADC HL,DE), so the input was basically a 32-bit value split across HL and IX. IY's role? Pointing to a table of bytes that need to be output to the screen in the meantime. SP at the time was pointing to a table of 32-bit values that corresponded to alternating pairs of negative and positive powers of ten, starting at one billion.
I've got to think of other ways I've used and abused the index registers, but I do recall other instances in my other projects. Gotta dig 'em up.
ld a,10 ;48 ten digits to look after. 1.000.000.000
DSBDECSCOLP:
ex af,af' ;52 62
pop bc ;62 72
pop de ;72 82
ld a,(iy+0) ;91 101
out ($11),a ;11
ld a,$FF ;18
inc a ;22
add ix,bc ;37
adc hl,de ;52
jr c,$-5 ;59
ld c,a ;63
ld a,(iy+1) ;82
out ($11),a ;11
pop de ;21
add ix,de ;36
pop de ;46
ld a,(iy+2) ;65 arghamahblabbles!
nop ;69 BAD CODE! WORST ENEMY! *rolls newspaper* WHAP WHAP WHAP
out ($11),a ;11
adc hl,de ;26
ld a,c ;30
add a,a ;34
add a,a ;38
ld c,a ;42
ld b,0 ;49
ld a,(iy+3) ;68
out ($11),a ;11
ld iy,NumberTable ;25
add iy,bc ;40
ex af,af' ;44
dec a ;48
jp nz,DSBDECSCOLP ;58
ld sp,(itemp1)
ld iy,myflags
jp DSBSCORCOLL
Never mind the comments. It was a late night with no caffeine.
EDIT: The scoreboard routine is written inline with the LCD update routine for the scoreboard. This is CaDan, so... yeah. I kinda needed the speed where it was available, but I had to make sure that no timings were violated for even the fastest condition. Those are the timings listed in the comments.
EDIT2: An almost identical run of code which sets up IY for the first run is not shown. That initializing code was used to render parts of the sprite above the score.
-
I must post this, as I know of the most IX-addicted program made...
When reading the asm output for GlaßOS, I noticed that it uses a standard way of passing parameters to functions. Push variables, then use IX like hell. Yes, SDCC has a drink... er, IX problem. I know that index registers add an extra byte or so to op sizes, and are slow. I can see a lot of TI-OS asm programmers not liking that variables, even one byte, aren't passed by register but by the stack. What about IY? It's dedicated for the interrupt, just like IX.
The only upside is the straightforwardness of a function that can take unlimited parameters and access them easily, but it's too much overkill. SDCC, you scare me most of the time.
-
Here's a strip of code that uses IX/IY in a manner more consistent with how a person might use HL. Except IX. This is an example where I pretty much run out of registers. I could probably have used DE' in place of IY, though. And SP was out of the question. Maybe I could've used register I for storage.
The trick to using IY lies in not using any romcalls or allowing any TI-OS interrupts while you're (ab)using it. All bets are off if you are writing your own OS, but still. Restoring IY is even easier: "ld iy,flags".
workonthisrow:
call leftiter
call centeriter
call fourthiter
ld a,10
Workonthisrowsub:
push af
call firstiter
call centeriter
call fourthiter
pop af
dec a
jr nz,Workonthisrowsub
call firstiter
call centeriter
call rightiter
ret
;External setup:
;HL = LUT for bit comparison
;HL'= LUT for result testing
; Two LUTs are indexed by HL by incrementing and decrementing H (256 byte wide tables)
;IX = pointer to buffer 1 (reading)
;IY = pointer to buffer 2 (writing)
;
;Internal setup:
;D= row above
;E= row below
;C= current position
;B= temporary variable
;B'=center byte storage
;
;Registers used so far:
; AF, BC, DE, HL, AF', BC', HL', IX, IY
;
;Free registers:
; DE'
;
firstiter:
ld d,(ix-12)
ld e,(ix+12)
ld c,(ix+00)
ld a,(ix-13)
rrca
ld a,d
rra
and 11101110b
ld l,a
ld b,(hl)
ld a,(ix+11)
rrca
ld a,e
rra
and 11101110b
ld l,a
ld a,(hl)
ex af,af'
ld a,(ix-01)
rrca
ld a,c
rra
and 10101010b
ld l,a
ld a,c
ex af,af'
add a,(hl)
add a,b
exx
ld l,a
ex af,af'
ld b,a
and (hl) \ dec h
or (hl) \ inc h
and 10001000b
ld c,a
exx
ret
centeriter:
ld a,d
and 11101110b
ld l,a
ld b,(hl)
ld a,e
and 11101110b
ld l,a
ld a,(hl)
ex af,af'
ld a,c
and 10101010b
ld l,a
ex af,af'
add a,(hl)
add a,b
exx
ld l,a
ld a,b
and (hl) \ dec h
or (hl) \ inc h
and 01000100b
or c
ld c,a
exx
ld a,d
and 01110111b
ld l,a
ld b,(hl)
ld a,e
and 01110111b
ld l,a
ld a,(hl)
ex af,af'
ld a,c
and 01010101b
ld l,a
ex af,af'
add a,(hl)
add a,b
exx
ld l,a
ld a,b
and (hl) \ dec h
or (hl) \ inc h
and 00100010b
or c
ld c,a
exx
ret
fourthiter:
ld a,(ix-11)
rlca
ld a,d
rla
and 01110111b
ld l,a
ld d,(hl)
ld a,(ix+13)
rlca
ld a,e
rla
and 01110111b
ld l,a
ld e,(hl)
ld a,(ix+01)
rlca
ld a,c
rla
and 01010101b
ld l,a
ld a,(hl)
add a,e
add a,d
exx
ld l,a
ld a,b
and (hl) \ dec h
or (hl) \ inc h
and 00010001b
or c
ld (iy+0),a
exx
inc ix
inc iy
ret
;=============== side of screen routines
leftiter:
ld d,(ix-12)
ld e,(ix+12)
ld c,(ix+00)
ld a,(ix-01)
rrca
ld a,d
rra
and 11101110b
ld l,a
ld b,(hl)
ld a,(ix+23)
rrca
ld a,e
rra
and 11101110b
ld l,a
ld a,(hl)
ex af,af'
ld a,(ix+11)
rrca
ld a,c
rra
and 10101010b
ld l,a
ld a,c
ex af,af'
add a,(hl)
add a,b
exx
ld l,a
ex af,af'
ld b,a
and (hl) \ dec h
or (hl) \ inc h
and 10001000b
ld c,a
exx
ret
rightiter:
ld a,(ix-23)
rlca
ld a,d
rla
and 01110111b
ld l,a
ld d,(hl)
ld a,(ix+01)
rlca
ld a,e
rla
and 01110111b
ld l,a
ld e,(hl)
ld a,(ix-11)
rlca
ld a,c
rla
and 01010101b
ld l,a
ld a,(hl)
add a,e
add a,d
exx
ld l,a
ld a,b
and (hl) \ dec h
or (hl) \ inc h
and 00010001b
or c
ld (iy+0),a
exx
inc ix
inc iy
ret
;
;Subroutine code end
;======================================
This code can be found in my earlier 2D Cellular automata project, which is table-based to allow rulesets other than Conway's Game of Life.
My problem these days with IX and IY isn't exactly the fact that they're slow(er) and large(r) (than HL). They're okay to use when you really need them, but my beef is that you can't use IXL and IXH operations and expect them to work on the Nspire. There was ONE instance in which I did that stunt in CaDan (somewhere in the enemy data reading routine I think) and people complained they couldn't play the demo on their Nspire.
-
Not entirely sure this counts as "just an extra HL"
I don't know what counts as "just an extra HL" either ;) I guess I mean using IX and IY effectively more than 5 times in a section of code.
-
When reading the asm output for GlaßOS, I noticed that it uses a standard way of passing parameters to functions. Push variables, then use IX like hell. Yes, SDCC has a drink... er, IX problem. I know that index registers add an extra byte or so to op sizes, and are slow. I can see a lot of TI-OS asm programmers not liking that variables, even one byte, aren't passed by register but by the stack. What about IY? It's dedicated for the interrupt, just like IX.
Passing arguments via the stack is standard for C. Though, you're right, it should try to optimize more.
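For readers who haven't seen it, here is a rough sketch of the general shape of stack-passed arguments read through IX as a frame pointer. It is hand-written for illustration and is not actual SDCC output; caller_example and add_bytes are hypothetical names.

```z80
; Hypothetical caller: pushes two one-byte arguments as a single word.
caller_example:
        ld      l,3             ; first argument  -> lower address
        ld      h,4             ; second argument -> higher address
        push    hl
        call    add_bytes       ; returns A = 3 + 4
        pop     hl              ; caller removes the arguments
        ret

; Hypothetical callee: returns arg1 + arg2 in A.
add_bytes:
        push    ix              ; preserve the caller's frame pointer
        ld      ix,0
        add     ix,sp           ; IX = SP, so IX now marks this frame
        ld      a,(ix+4)        ; first argument: above saved IX (2) and return address (2)
        add     a,(ix+5)        ; second argument
        pop     ix
        ret
```

Every (ix+d) access is 19 t-states and 3 bytes, against 4 t-states and 1 byte for a register-to-register ld, which is where the size and speed complaint comes from.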
-
Well, I've used IX in quite a few weird ways.
I guess my main IX hack is using it as a pointer to where I need to jump next. For instance, in my TruSound, I use IX as the pointer to the main routine because jp (ix) is faster than jp imm16 (8 t-states versus 10). There I also use IY to hold the address where I should jump to continue decoding the sound. Basically, I see where I'm at in the current decoding process and give IYL a pointer into a jump table for where to go on the next byte.
Another hack I recently figured out is that (iy - 3) is the end of plotSScreen. This means you can have your own set of flags that are guaranteed to be free to use. And since it uses the location that IY is already at, you're not even giving up a register.
Edit:
I also use IX when I am making my own bcall routine. Putting stuff in IX just leads to a smaller bcall routine.
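A small sketch of the jp (ix) trick from this post, assuming hypothetical labels main_loop, step_a, and step_b; it is not taken from TruSound. Despite the parentheses, jp (ix) jumps to the address held in IX rather than to a value read from memory.

```z80
; Sketch only: main_loop, step_a, and step_b are hypothetical labels.
        ld      ix,main_loop    ; set up once; IX must stay untouched afterwards
main_loop:
        ; ... decide which step runs next ...
        jp      step_a

step_a:
        ; ... work ...
        jp      (ix)            ; 8 t-states, 2 bytes
step_b:
        ; ... work ...
        jp      (ix)            ; versus jp main_loop: 10 t-states, 3 bytes
```

The saving is 2 t-states and 1 byte per return to the main routine, at the price of keeping IX reserved for the whole loop.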
-
Another hack I recently figured out is that (iy - 3) is the end of plotSScreen. This means you can have your own set of flags that are guaranteed to be free to use. And since it uses the location that IY is already at, you're not even giving up a register.
I never would have thought of that 0_0
-
Another hack I recently figured out is that (iy - 3) is the end of plotSScreen.
saveSScreen*
-
Personally, I often use IX half registers as 8-bit backups.
In this case, saving/restoring a value is twice as big as using the stack, but when speed is the desired criterion, saving 5 t-states can definitely matter.
-
Personally, I often use IX half registers as 8-bit backups.
In this case, saving/restoring a value is twice as big as using the stack, but when speed is the desired criterion, saving 5 t-states can definitely matter.
Actually, using 8-bit backups uses an average of 8 t-states when it comes to half registers. However, sometimes you run into code where using the stack in that instance could be dangerous (for example, pushing a value before you ret), and then the 8-bit backups come in handy.
-
Actually, using 8-bit backups uses an average of 8 t-states when it comes to half registers.
That's right, 8 t-states, but you have to consider both the saving and restoring operations to really see the total time you're saving:
1 push + 1 pop = 21 ts with 2 bytes
1 ld ixx,r + 1 ld r,ixx = 16 ts with 4 bytes
Half index registers are definitely cool, especially when coding without space restrictions (apps basically).
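Written out as a sketch, the comparison above looks like this. The half-register loads are undocumented Z80 instructions, so the mnemonic ixl is an assumption about the assembler (some spell it differently or need the DD prefix emitted by hand), and, as mentioned earlier in the thread, they reportedly don't work on the Nspire.

```z80
; Option 1: stack (2 bytes, 21 t-states in total; also saves the flags)
        push    af              ; 11 t-states
        ; ... code that clobbers A ...
        pop     af              ; 10 t-states

; Option 2: IX half register (4 bytes, 16 t-states in total)
        ld      ixl,a           ; 8 t-states, encoded DD 6F
        ; ... code that clobbers A but leaves IX alone ...
        ld      a,ixl           ; 8 t-states, encoded DD 7D
```

Option 2 only pays off when IX is otherwise idle in that stretch of code, and it backs up A alone rather than AF.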