Omnimaga
Calculator Community => TI Calculators => ASM => Topic started by: TheMachine02 on April 24, 2014, 06:06:47 am
-
So, I was trying to write a routine that updates only the parts of the screen that need to be updated (since wireframe 3D has a lot of white in it...), but my routine totally failed.
I suppose there is a place where I didn't put enough delay (which I have marked), but increasing the delay there doesn't really change anything :P
Basically, it takes two buffers, pointed to by hl-767 and de-767; hl is what is on the screen and de is what I want to display.
So I was wondering if there is a way to correct this routine and/or make it faster and more optimized.
_BufferFlip:
ld a, $06
out ($10), a
ld a, $BF
out ($10), a
;set LCD to Y-decrement mode and set the max row
_OutLoop:
push af
ld b, $0C
_GetByte:
ld a, (de)
cp (hl)
ld (de), 0
ld (hl),a
jr nz, _PutByte
dec hl
dec de
djnz _GetByte
;compare until a differing byte is found
pop af
dec a
cp $80
;if 0, a=$80, and return.
jr z, _End
jp _OutLoop
_PutByte:
push af
ld a, b
add a, $33
;set the column
out ($10), a
pop af
;here not enough delay !!!
out ($11), a
;write the byte
jp _GetByte
_End:
ld a,$05
out ($10), a
;put back LCD in the correct mode
ret
-
You need a delay between every write to port $10 or $11.
So, basically, every time you use an out instruction to port $10 or $11, you have to make sure there is enough delay before writing again.
For example, when you set the row (the second "out ($10),a"), it is ignored by the LCD because it is still busy activating Y-decrement mode.
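As a rough sketch of what that could look like, here is the start of the routine with a wait before each write. LcdWait is a hypothetical helper name, not something from the original code, and it assumes that bit 7 of port $10 reads as 1 while the LCD driver is busy; if that does not hold on your hardware, a fixed delay loop would be needed instead.

```z80
; Hypothetical helper: spin until the LCD driver is ready for another write.
; Assumption: bit 7 of port $10 is set while the driver is busy.
; Preserves af, so the byte waiting in a is still intact afterwards.
LcdWait:
 push af
LcdWaitLoop:
 in a, ($10)
 rla                 ; move the busy flag (bit 7) into carry
 jr c, LcdWaitLoop   ; still busy, keep polling
 pop af
 ret

; Same two commands as in the first post, now each preceded by a wait
 ld a, $06
 call LcdWait
 out ($10), a        ; Y-decrement mode
 ld a, $BF
 call LcdWait
 out ($10), a        ; row command, no longer sent while the LCD is busy
```

The same call would go before the "out ($10), a" and "out ($11), a" in _PutByte. Since the helper preserves af, the data byte in a survives the wait.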
-
I'm not convinced that using a custom display routine like that would really save you that much speed. The only possible merit would be if it runs at 15MHz, and that's still a small one.
At 6MHz, even if nothing changed between one frame and the last, my best efforts at prototyping a fast diffing display algorithm that also clears the last frame buffer resulted in something that still runs in about 2/3 of the time that Axe's standard DispGraphClrDraw runs in. That may sound like a big speed boost, but when you take into account the other processing that has to be done each frame, it really isn't. Assuming the current framerate is relatively low (<=30) and that's why you were searching for this boost, the framerate would only improve by about framerate/3 percent; 30fps would become 33fps, 20fps would become 21.3fps, and 10fps would become 10.3fps. The total percentage gains only start becoming substantial above 30fps, but if the framerate is above 30fps, you don't really need the gains anyways.
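To make that 6MHz arithmetic explicit, here is a rough sketch. It assumes a frame takes 1/f seconds in total and that the faster display routine saves a fixed time s per frame; the value s = 1/300 s is inferred from the "framerate/3 percent" rule of thumb, not measured.

```latex
f' = \frac{1}{\frac{1}{f} - s} = \frac{f}{1 - f s},
\qquad
\frac{f' - f}{f} = \frac{f s}{1 - f s} \approx f s
```

With s = 1/300 s, the relative gain f s works out to f/3 percent, and f = 30, 20, 10 give f' of about 33.3, 21.4 and 10.3, roughly in line with the figures above.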
At 15MHz, it might be worth it. With my prototype algorithm, if nothing changed between one frame and the next, it would run in about 1/4 of the time that Axe's standard DispGraphClrDraw runs in. With a relatively low framerate (<=30), it would improve by about framerate*4/5 percent; 30fps would become 37.6fps, 20fps would become 23.1fps, and 10fps would become 10.7fps. But keep in mind that these are numbers for the best case scenario, in which nothing at all has changed between frames. Even if only a quarter of the buffer bytes changed, I suspect those gains would be about cut in half. If about half of the buffer bytes changed, the gains would probably be cut down to a quarter.
In conclusion: you'd probably be better off spending your time looking for optimization potential elsewhere, namely in the graphics rendering code itself. :P If you want to fill me/us in on the kind of computations that are done for graphics on an average frame, I/others might be able to give some ideas.
-
Seeing it like that, it indeed doesn't bring much of an optimization. OK then, let's try something else. :P