Omnimaga
Calculator Community => Casio Calculators => Topic started by: Ashbad on May 23, 2011, 10:26:31 am
-
Since Cemetech has one, I think we need one as well. This section is where you can post any routines for the Casio Prizm you make in C or SH4 assembly. I don't think I have the liberty to post others' routines over here, but I'll be glad to start with some of my own, and I'm sure Qwerty will soon post some of his very well done SH4 routines [nice and optimized, too!] such as his LRNG. Here are a few in C, all written by me and decently optimized:
Routine: Draw Alpha-coded sprite
inputs: pointer to data, x pos, y pos, width, height, alpha degree 1-255 (0 denotes no alpha value and normal drawing)
void AlphaSprite(short* data, int x, int y, int width, int height, unsigned char alpha) {
    short* VRAM = (short*)0xA8000000;
    int CurColor = 0;
    VRAM += (LCD_WIDTH_PX * y) + x;
    for(int j = y; j < y + height; j++) {
        for(int i = x; i < x + width; i++) {
            CurColor = (*(VRAM) + (alpha * (*(data++)) / 256)) / 2;
            *(VRAM++) = CurColor % 65536;
        }
        VRAM += (LCD_WIDTH_PX - width);
    }
}
Generate pseudo-random number
input - 32bit integer for upper and lower bounds
This is based on the well-known xorshift128 technique and is considered random enough for most purposes.
int Rand32(int lower, int upper) {
    static int a = 123456789;
    static int b = 362436069;
    static int c = 521288629;
    static int d = 88675123;
    int t;
    t = a ^ (a << 11);
    a = b;
    b = c;
    c = d;
    d = d ^ (d >> 19) ^ (t ^ (t >> 8));
    /* reduce to [lower, upper); unsigned modulo avoids negative results */
    return (int)((unsigned int)d % (unsigned int)(upper - lower)) + lower;
}
Test a pixel of CURRENTLY rendered VRAM image (tari and Ashbad optimized)
inputs: X and Y position
outputs: short holding color value
short getpixel(short x, short y) {
short *VRAM = (short *)0xA8000000;
return *(VRAM + (y * LCD_WIDTH_PX) + x);
}
Invert Colors in Rectangular Area
inputs: shorts for x position, y position, length, and width
outputs: look at your screen! 8) (if you actually render it)
void InvArea(short x, short y, short height, short width) {
    unsigned short *VRAM = (unsigned short *)0xA8000000;
    for(short a = 0; a < width; a++) {
        for(short b = 0; b < height; b++) {
            *(VRAM + (y + b) * LCD_WIDTH_PX + (x + a)) ^= 0xFFFF;
        }
    }
}
Print Custom Character (improved)
inputs: an x position, a y position, a bit mapped image of the character, color to draw the character, if it is to draw a backcolor in non-mapped spaces, a short to specify backcolor (put 0 if DrawBackColor is false) , and the dimensions of the character
EXAMPLE BIT MAPPED CHARACTER (the letter A):
[0,0,1,0,0]
[0,1,0,1,0]
[0,1,1,1,0]
[0,1,0,1,0]
[0,1,0,1,0]
outputs: look at your screen :) (unless you don't render it)
void PutCustC(bool* map, short x, short y, short width, short height, short color, bool DrawBackColor, short backcolor) {
    short* VRAM = (short*)0xA8000000;
    VRAM += (LCD_WIDTH_PX * y) + x;
    for(short b = 0; b < height; b++) {
        for(short a = 0; a < width; a++) {
            if(*(map + b * width + a)) { *(VRAM++) = color; }
            else if(DrawBackColor) { *(VRAM++) = backcolor; }
            else { VRAM++; }
        }
        VRAM += (LCD_WIDTH_PX - width);
    }
}
Draw Circle Routine
inputs: x, y, draw color and radius
outputs: look at your screen :) (that is, if you render it)
NOTE: uses a SetPoint routine in display_syscalls.h
void DrawCircle(short x0, short y0, short radius, short color)
{
    short f = 1 - radius;
    short ddF_x = 1;
    short ddF_y = -2 * radius;
    short x = 0;
    short y = radius;
    Bdisp_SetPoint_VRAM(x0, y0 + radius, color);
    Bdisp_SetPoint_VRAM(x0, y0 - radius, color);
    Bdisp_SetPoint_VRAM(x0 + radius, y0, color);
    Bdisp_SetPoint_VRAM(x0 - radius, y0, color);
    while(x < y)
    {
        // ddF_x == 2 * x + 1;
        // ddF_y == -2 * y;
        // f == x*x + y*y - radius*radius + 2*x - y + 1;
        if(f >= 0)
        {
            y--;
            ddF_y += 2;
            f += ddF_y;
        }
        x++;
        ddF_x += 2;
        f += ddF_x;
        Bdisp_SetPoint_VRAM(x0 + x, y0 + y, color);
        Bdisp_SetPoint_VRAM(x0 - x, y0 + y, color);
        Bdisp_SetPoint_VRAM(x0 + x, y0 - y, color);
        Bdisp_SetPoint_VRAM(x0 - x, y0 - y, color);
        Bdisp_SetPoint_VRAM(x0 + y, y0 + x, color);
        Bdisp_SetPoint_VRAM(x0 - y, y0 + x, color);
        Bdisp_SetPoint_VRAM(x0 + y, y0 - x, color);
        Bdisp_SetPoint_VRAM(x0 - y, y0 - x, color);
    }
}
EDIT: and I'm not sure where to put this, but here's a small yet useful update to the BFILE_syscalls.h header, with defines that stand in more readably for the numeric 'mode' argument of the Bfile_OpenFile_OS function.
#define O_READ 0x01
#define O_READ_SHARE 0x80
#define O_WRITE 0x02
#define O_READWRITE 0x03
#define O_READWRITE_SHARE 0x83
int Bfile_OpenFile_OS( const unsigned short*filename, int mode );
int Bfile_CloseFile_OS( int HANDLE );
int Bfile_ReadFile_OS( int HANDLE, void *buf, int size, int readpos );
int Bfile_CreateEntry_OS( const unsigned short*filename, int mode, int*size );
int Bfile_WriteFile_OS( int HANDLE, const void *buf, int size );
int Bfile_DeleteEntry( const unsigned short *filename );
void Bfile_StrToName_ncpy( unsigned short*dest, const unsigned char*source, int n );
-
Please, use Syscall 0x01E6 to obtain the VRAM address.
Interface: void *GetVRAMAddress(void);
-
Thanks Ashbad for those routines. Hopefully they might be useful for some people here too. Some other people should post routines here too if they have any.
-
Please, use Syscall 0x01E6 to obtain the VRAM address.
Interface: void *GetVRAMAddress(void);
I could, but since the address is known, it's a whole lot easier in many cases just to remember the magic number 0xA8000000 ;) however, good point brought up.
I just updated the first post at the bottom with an updated BFILE_syscalls.h header, which will prove to be very useful if you're opening any files at all.
-
Could the address possibly change in future OS versions? I hope not....
-
Could the address possibly change in future OS versions? I hope not....
From what I know, I don't think it would; otherwise many other things would also get moved around, so I think for now we're safe :)
-
I see no reason why they couldn't change it, to be honest. All it would take is a change in the TLB code to move it. A few hundred bytes right or left isn't that much stuff to rearrange. They did that to make the Prizm OS, if anyone remembers.
-
Yeah, true. I just hope it won't change too much from one OS to another. Remember the mess when OS 2.53MP got released.
-
A few new routines, obviously adapted from somebody else and not by me; however, I shall give full credit to the genius author of these awesome routines, prizmized by me. EDIT: I was not able to easily convert a generic polygon-filling C routine to a decent Prizm-like format, so feel free to find/make your own :)
See if point is within bounds of a polygon
inputs:
int polySides = how many corners the polygon has
float polyX[] = horizontal coordinates of corners
float polyY[] = vertical coordinates of corners
float x, y = point to be tested
outputs:
returns if the point is within the bounds of the polygon.
Credit: many thanks to Darel Rex Finley, http://alienryderflex.com/polygon/
bool pointInPolygon(int polySides, float polyX[], float polyY[], float x, float y) {
    int i, j = polySides - 1;
    bool oddNodes = 0;
    for (i = 0; i < polySides; i++) {
        if ((polyY[i] < y && polyY[j] >= y)
         || (polyY[j] < y && polyY[i] >= y)) {
            if (polyX[i] + (y - polyY[i]) / (polyY[j] - polyY[i]) * (polyX[j] - polyX[i]) < x) {
                oddNodes = !oddNodes;
            }
        }
        j = i;
    }
    return oddNodes;
}
-
I could, but since the address is known, it's a whole lot easier in many cases just to remember the magic number 0xA8000000.
Could the address possibly change in future OS versions? I hope not....
They did this with OS 2.00 for the fx-9860G/GII back in 2009 and broke several hard-coded add-ins. So, do not rely on 0xA8000000.
-
Ouch this sucks. I think it would be best to avoid direct address usage then. I think this is what we do on some 83+ stuff.
-
Please, use Syscall 0x01E6 to obtain the VRAM address.
Interface: void *GetVRAMAddress(void);
Due to my dislike of any syscalls I will just write a routine that does the exact same thing and polls the same address from ROM. It sounds foolish, but that is just the way I roll 8)
Note: I will go as far as file systems and text routines when it comes to avoiding syscalls.
-
Okay, your decision but as long as a Prizm routine isn't buggy and fits my needs, I won't rewrite the whole thing.
-
Alright, I have a conversion from C to asm. It's GetPixel; it's as fast as possible and cannot be optimized further, unless gcc automatically handles the zero extension, which I'll have to check.
Test a pixel of CURRENTLY rendered VRAM image (z80man optimized)
short GetPixel(short x, short y)
{
MOV.W (width),R2
MOV.L (VRAM),R3
MULU.W r2,R5
ADD R4,R3
STS MACL,R2
ADD R3,R2
MOV.W @R2,R0
RTS
EXTU.W R0,R0
align.4
width: word 384
VRAM: long VRAM_address
-
this is awesome :D I would suggest putting it into .s format so C coders can use it.
-
Alright, I have a conversion from C to asm. It's GetPixel; it's as fast as possible and cannot be optimized further, unless gcc automatically handles the zero extension, which I'll have to check.
Is it just me or does this routine not handle multiplying by 2 to index the array?
-
Alright, I have a conversion from C to asm. It's GetPixel; it's as fast as possible and cannot be optimized further, unless gcc automatically handles the zero extension, which I'll have to check.
Is it just me or does this routine not handle multiplying by 2 to index the array?
That is a very valid concern there. I was just umm testing you to see if you would umm catch it :P Unless your image buffer was 256 colors there would be a failure. Routine now fixed. I'll disassemble the results to see how gcc handles inline functions so I can optimize it.
short GetPixel(short x, short y)
{
MOV.W (width),R2 ;get screen width of 384 * 2
MOV.L (VRAM),R3 ;get VRAM buffer location, usually 0xA8000000, but uses pre-initialized global variable just in case.
MULU.W R2,R5 ;unsigned 16 bit multiplication
SHLL R4 ;single left bit shift which multiplies R4 by 2. Also used to fill slot before the MAC load
STS MACL,R2 ;load result of multiplication into R2
ADD R3,R2 ;add VRAM base address to resulting y value
ADD R4,R2 ;add modified x value to the already added y and VRAM base
MOV.W @R2,R0 ;Load word from what's at R2 into R0. Sign extension
RTS ;delayed branch and return. Note, R0 not touched because the result of the previous instruction is still being loaded
EXTU.W R0,R0 ;safe to touch R0 now. Get rid of that pesky sign extension. May remove later if gcc handles this step on its own
align.4
width: word 384 * 2
VRAM: long VRAM_address ;global variable that was initialized earlier to correct address.
}
-
I made a little optimization that takes advantage of the move with offset:
short GetPixel(short x, short y)
{
MOV.W (width),R2 ;get screen width of 384 * 2
MOV.L (VRAM),R0 ;get VRAM buffer location, usually 0xA8000000, but uses pre-initialized global variable just in case.
MULU.W R2,R5 ;unsigned 16 bit multiplication
SHLL R4 ;single left bit shift which multiplies R4 by 2. Also used to fill slot before the MAC load
STS MACL,R2 ;load result of multiplication into R2
ADD R4,R2 ;add modified x value to the scaled y offset
MOV.W @(R0,R2),R0 ;Load word from R2 offsetted from the VRAM base into R0. Sign extension
RTS ;delayed branch and return. Note, R0 not touched because the result of the previous instruction is still being loaded
EXTU.W R0,R0 ;safe to touch R0 now. Get rid of that pesky sign extension. May remove later if gcc handles this step on its own
align.4
width: word 384 * 2
VRAM: long VRAM_address ;global variable that was initialized earlier to correct address.
}
-
Nice job catching that there calc84. Sometimes I forget possible optimizations when I don't have the full instruction set in front of me.
btw, on the instructions that you replaced, I wanted to know if you're aware of the hidden slowdown when you access a register right after the previous instruction loaded data from memory into it. The code is fine, but I was just wondering whether it was by luck that you had it formatted that way, or if you knew the whole time, because you seem to be very skilled with SuperH asm.
-
Nice job catching that there calc84. Sometimes I forget possible optimizations when I don't have the full instruction set in front of me.
btw, on the instructions that you replaced, I wanted to know if you're aware of the hidden slowdown when you access a register right after the previous instruction loaded data from memory into it. The code is fine, but I was just wondering whether it was by luck that you had it formatted that way, or if you knew the whole time, because you seem to be very skilled with SuperH asm.
I've gotten pretty used to ARM9 assembly at this point, which has similar memory load and multiplication delays. I studied up on SH3 when we started learning about the Prizm a while back (and personally I like ARM better, but that might just be me).
Edit: just for the lulz, here's how I might do this in ARM assembly:
GetPixel:
ldr r2,=VRAM @buffer = VRAM;
add r3,r1,r1,lsl #1 @temp = y*3;
add r3,r0,r3,lsl #7 @temp = x+temp*128;
add r3,r2,r3,lsl #1 @temp = buffer+temp*2;
ldrh r0,[r3] @return *temp;
bx lr
-
Here are a couple small routines someone might find useful.
Switch register banks:
stc sr, r8
mov.l 1f, r9
or r9, r8
ldc r8, sr
1 byte ASCII number -> Hex (r4 and r5 contain the high and low nibbles respectively)
xor r7,r7
Start:
add r7,r4
add #0xE0,R4
mov #0x3A,r6
cmp/hs r4,r6
bf/s Start
mov #0xFF,r7
shll2 r4
shll2 r4
/* Loop is unrolled here for speed, but you could easily loop back to the beginning for this after storing r4 to a safe register. */
xor r7,r7
Start2:
add r7,r5
add #0xE0,R5
mov #0x3A,r6
cmp/hs r5,r6
bf/s Start2
mov #0xFF,r7
add r5,r4
rts
mov r4,r1
-
Just as a note, you might want to keep your routines C-compliant in their register usage so that other C coders can embed them in their programs. For example, in the first routine make sure you push r8 and r9 onto the stack beforehand, and in the second routine remember that arguments are passed in r4-r7 (then the stack) and return data goes in r0 and r1. There are a few exceptions when it comes to structs, but for the most part it is pretty general. Leaving the routines the way they are now will force C coders to modify them, which they may not be experienced enough to do.
-
Well, the first routine is for ASM coders to access the banked registers that GCC doesn't touch, so it wouldn't really have much use for C coders (except to crash their code). The second routine is quite simple to fix. Just add "mov r4,r1" to the end of the code, with appropriate replacement of rts.
EDIT: changed.
-
So the first routine switches the mode from privileged to user it appears. I do have one question, how would you get back to privileged mode then if the instructions to access the SR are disabled?
-
No, it switches the register set that's currently swapped in. The Processor has two register sets in Privileged mode: Regular and banked (which map to r8-r15). The GCC doesn't use the banked registers, so they're free to the ASM programmer to mess with. I doubt the OS uses them much either with the probable exception of error handling.
Needless to say, I love abusing the banked registers :P
-
No, it switches the register set that's currently swapped in. The Processor has two register sets in Privileged mode: Regular and banked (which map to r8-r15). The GCC doesn't use the banked registers, so they're free to the ASM programmer to mess with. I doubt the OS uses them much either with the probable exception of error handling.
Needless to say, I love abusing the banked registers :P
That could be useful for the ex based instructions on my 83+ emulator. I currently store the z80 registers in the r8-r15 range so I would just have to preserve the non-shadowed registers.
-
Here's a few routines I'm using, thanks to Qwerty for giving me some input on the Random ones (made because rand and srand didn't work for me due to linking errors):
int GetKeyNB() {
int key;
key = PRGM_GetKey();
if (key == KEY_PRGM_MENU)
GetKey(&key);
return key;
}
void GetKeyHold(int key) {
while(GetKeyNB() != key) { }
while(GetKeyNB() != KEY_PRGM_NONE) { }
}
void GetKeyWaitNone() {
while(GetKeyNB() != KEY_PRGM_NONE) { }
}
unsigned short RandomShort() {
    unsigned short retshort = 0;
    int* cur_stack = 0;
    int cur_stackv = 0;
    for(int i = 0; i < 16; i++) {
        retshort = retshort << 1;
        cur_stack = GetStackPtr();
        cur_stackv = *(RTC_GetTicks() + cur_stack);
        retshort = retshort | (cur_stackv & 1);
    }
    return retshort;
}
unsigned int RandomInt() {
    unsigned int retint = 0;
    int* cur_stack = 0;
    int cur_stackv = 0;
    for(int i = 0; i < 32; i++) {
        retint = retint << 1;
        cur_stack = GetStackPtr();
        cur_stackv = *(RTC_GetTicks() + cur_stack);
        retint = retint | (cur_stackv & 1);
    }
    return retint;
}
unsigned char RandomChar() {
    unsigned char retchar = 0;
    int* cur_stack = 0;
    int cur_stackv = 0;
    for(int i = 0; i < 8; i++) {
        retchar = retchar << 1;
        cur_stack = GetStackPtr();
        cur_stackv = *(RTC_GetTicks() + cur_stack);
        retchar = retchar | (cur_stackv & 1);
    }
    return retchar;
}
-
You might want to run a test on those random routines due to a possible issue with the RTC. The problem is that the RTC ticks only 64 times a second, so if you're calling that same routine multiple times in quick succession, there may be some repetition. In Simon's random routine, he used a static seed to contribute to the result, which increased randomness.
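The static-seed mixing described above can be sketched like this. This is a minimal illustration rather than Simon's actual routine: `rand_mixed`, its LCG constants, and taking the tick count as a plain argument are all my own assumptions (on the Prizm you would presumably call it as `rand_mixed(RTC_GetTicks())`).

```c
#include <stdint.h>

/* Static state advances on every call, so two calls within the same
   RTC tick still return different values. */
static uint32_t mix_state = 0x12345678u;

uint16_t rand_mixed(uint32_t rtc_ticks)
{
    /* step a 32-bit LCG even when rtc_ticks has not changed */
    mix_state = mix_state * 0x41C64E6Du + 0x3039u;
    uint32_t v = mix_state ^ rtc_ticks;
    /* fold the high half into the low half and return 16 bits */
    return (uint16_t)((v >> 16) ^ v);
}
```

Even with a constant `rtc_ticks`, consecutive results differ because the LCG state changes between calls.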
-
In Simon's random routine, he used a static seed to contribute to the result, which increased randomness.
I did not invent the random routine you referred to. It uses the same algorithm as the one from the old CASIO SDK or the Hitachi compiler's libraries (I only translated it to C). Therefore I think you are right: the randomness must be well balanced.
-
So, here I finally made a new routine that seems *really* random. When I directly used its result, seeded with 0 and taken mod 6, in my map generating routine, the tile output showed no patterns whatsoever. However, I attained that by keeping the RTC timer at bay for a *long*, *random* time. Anyway, here's the routine on its own, which can even seed itself or be seeded by RTC_GetTicks:
unsigned short random(int extra_seed) {
int seed = 0;
int seed2 = 0;
for(int i = 0; i < 32; i++ ){
seed <<= 1;
seed |= (RTC_GetTicks()%2);
}
for(int i = 0; i < 32; i++ ){
seed2 <<= 1;
seed2 |= (RTC_GetTicks()%16);
}
seed ^= seed2;
seed = (( 0x41C64E6D*seed ) + 0x3039);
seed2 = (( 0x1045924A*seed2 ) + 0x5023);
extra_seed = (extra_seed)?(( 0xF201B35C*extra_seed ) + 0xD018):(extra_seed);
seed ^= seed2;
if(extra_seed){ seed ^= extra_seed; }
return ((seed >> 16) ^ seed) >> 16;
}
It basically gets one bit from the timer at a time for the first seed; for the second, "darker" seed (higher chance of filled bits), each bit gets 4 opportunities to be seeded with a 1. It then goes through the standard wacky operations to fill in the bitfield better, xors the available seeds together, and returns them as a short. Very loosely based on Simon's routine.
It's random *depending on how it's used*. If you use it every now and then, or if you seed it with RTC_GetTicks or another varying number, it will *surely* be random (even with a seed of 0 it will still be random, though). In loops, it's fixed by slowing down operation in a tradeoff for randomness:
for(int i = 0; i < 3600; i++) {
(*(map+i)).natural_cell = random(random(RTC_GetTicks()))%6;
OS_InnerWait_ms(random(0)%64);
}
As you see above, the actual random value is seeded by another random value that is not seeded, which worked really well. For the OS_InnerWait_ms, I have it wait an unseeded random time, mod 64 (so it won't wait very long, and since the RTC timer increments 64 times a second, that should give it long enough to increment in usual cases).
All because of a rand() and srand() linking error :) fun night spent.
-
This routine will print masked Glyph numbers for you :) kinda application specific, but others could find use of it.
void drawprintnum(int x, int y, int num, int width, int height, void*glyph_nums) {
int digits[10] = {-1,-1,-1,-1,-1,-1,-1,-1,-1,-1};
int pos_num = (num<0)?(-num):(num);
int num_digits = 0;
for(int i = 0; i < 10; i++) {
if(!pos_num) { break; }
digits[i] = pos_num%10;
pos_num = (pos_num - (pos_num%10)) / 10;
num_digits++;
}
if(num<0) {
CopySpriteMasked(glyph_nums + (width*height*2*10), x, y, width, height, 0xFFFF);
x += width+2;
} else if (num == 0) {
CopySpriteMasked(glyph_nums, x, y, width, height, 0xFFFF);
return;
}
for(int i = num_digits-1; i >= 0; i--) {
CopySpriteMasked((digits[i]*(width*height*2))+glyph_nums, x, y, width, height, 0xFFFF);
x += width+2;
}
}
-
Is there any routine to draw sprites with alpha based on a colour?
-
I assume you mean that the alpha color is supplied at runtime and is only a single bit of alpha, meaning every pixel is either opaque or fully transparent. Here's what I got; it should do the trick. It's pretty simple, and do note that it doesn't support clipping, so if you need that, ask and I'll write another routine.
Edit: fixed bug
Edit2: fixed another bug
#define WIDTH 384
#define HEIGHT 216
short * VRAM = (short*) GetVRAMAddress();
void alphaSprite(int x, int y, int width, int height, short * bitmap, short alpha)
{
int y_index;
int x_index;
short * base = y * WIDTH + x + VRAM;
for (y_index = height; y_index > 0; --y_index, base += WIDTH - width)
{
for (x_index = width; x_index > 0; --x_index, ++base, ++bitmap)
{
if (*bitmap != alpha) *base = *bitmap;
}
}
}
-
I think if (*base != alpha) should be if (*bitmap != alpha) because you want that color in the bitmap to cause transparency, not that color in VRAM.
Edit: Also, base += WIDTH should be base += WIDTH - width because base was increased by the inner loop.
-
I think if (*base != alpha) should be if (*bitmap != alpha) because you want that color in the bitmap to cause transparency, not that color in VRAM.
Edit: Also, base += WIDTH should be base += WIDTH - width because base was increased by the inner loop.
Good eye there. I was umm testing you for your ability to catch my on-purpose mistakes :/
The error has been fixed in my above post
-
I need clipping too, could you do it, please?
-
I need clipping too, could you do it, please?
Sure, I based this example off my previous routine but with some more work I can get a more optimized version out.
#define WIDTH 384
#define HEIGHT 216
short * VRAM = (short*) GetVRAMAddress();
void alphaSprite(int x, int y, int width, int height, short * bitmap, short alpha)
{
int y_index;
int x_index;
short * base = y * WIDTH + x + VRAM;
if (y < 0)
{
base += (y & 0x8fffffff) * WIDTH;
bitmap += (y & 0x8fffffff) * WIDTH;
height += y;
y = 0;
}
if (height > HEIGHT) height = HEIGHT + y;
for (y_index = height; y_index > 0; --y_index, base += WIDTH - width, bitmap += x_inc)
{
for (x_index = width; x_index > 0; --x_index, ++base, ++bitmap)
{
if (*bitmap != alpha) *base = *bitmap;
}
}
}
-
Okay, that routine is kind of messed up, so here is what I believe to be a fixed version (by the way, y & 0x8FFFFFFF is not even CLOSE to -y)
#define WIDTH 384
#define HEIGHT 216
short * VRAM = (short*) GetVRAMAddress();
void alphaSprite(int x, int y, int width, int height, short * bitmap, short alpha)
{
int x_inc = width;
if (y < 0)
{
bitmap -= y * x_inc;
height += y;
y = 0;
}
if (height > HEIGHT - y) height = HEIGHT - y;
if (x < 0)
{
bitmap -= x;
width += x;
x = 0;
}
if (width > WIDTH - x) width = WIDTH - x;
int y_index;
int x_index;
short * base = y * WIDTH + x + VRAM;
for (y_index = height; y_index > 0; --y_index, base += WIDTH - width, bitmap += x_inc)
{
for (x_index = width; x_index > 0; --x_index, ++base, ++bitmap)
{
if (*bitmap != alpha) *base = *bitmap;
}
}
}
Edit: I just noticed the x_inc in your code that you never initialized, edited my code to include that.
-
I guess that's what happens when you code with little sleep and don't have access to a test platform :P
The x_inc I used (and must have accidentally deleted) is for how much the bitmap pointer needs to be incremented every time a new row is drawn. It is left at zero unless the bitmap overlaps either or both of the left and right edges. Here is what I modified, to hopefully work. Also, one thing I forgot to do is add some pointer type casts: even though I'm pretty sure gcc will compile this, g++ could have some issues due to the stronger C++ rules about type casts.
#define WIDTH 384
#define HEIGHT 216
short * VRAM = (short*) GetVRAMAddress();
void alphaSprite(int x, int y, int width, int height, short * bitmap, short alpha)
{
int x_inc = 0;
if (y < 0)
{
bitmap -= (short*)(y * width);
height += y;
y = 0;
}
if (height > HEIGHT - y) height = HEIGHT - y;
if (x < 0)
{
bitmap -= (short*)x;
width += x;
x = 0;
x_inc += -x;
}
if (width > WIDTH - x)
{
width = WIDTH - x;
x_inc += WIDTH - (width + x);
}
int y_index;
int x_index;
short * base = (short*)(y * WIDTH) + x + VRAM;
for (y_index = height; y_index > 0; --y_index, base += (short*)WIDTH - width, bitmap += (short*)x_inc)
{
for (x_index = width; x_index > 0; --x_index, ++base, ++bitmap)
{
if (*bitmap != alpha) *base = *bitmap;
}
}
}
-
Eww bad bad typecasts. They won't work (ever try adding a short* to a short*?) and they're just not needed. Adding an int to a pointer is the preferred method for offsetting anyway.
-
Eww bad bad typecasts. They won't work (ever try adding a short* to a short*?) and they're just not needed. Adding an int to a pointer is the preferred method for offsetting anyway.
It's been a while since I did some heavy messing around with type casts. Looks like I forgot that adding an int to a pointer results in an implicit type cast. All of my programming class stuff (which is in Java) makes pointers a lot harder when I come back to C, especially because Java is very strongly typed.
-
Eww bad bad typecasts. They won't work (ever try adding a short* to a short*?) and they're just not needed. Adding an int to a pointer is the preferred method for offsetting anyway.
It's been a while since I did some heavy messing around with type casts. Looks like I forgot that adding an int to a pointer results in an implicit type cast. All of my programming class stuff (which is in Java) makes pointers a lot harder when I come back to C, especially because Java is very strongly typed.
Adding an int to a pointer results in a pointer. There's no implicit typecast at all, that's just how it's defined.
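A tiny host-testable illustration of that rule (the function name `byte_offset_of_third` is made up for the example):

```c
#include <stddef.h>

static short buf[10];

/* pointer + int yields a pointer of the same type, advanced by whole
   elements; no cast is involved anywhere */
ptrdiff_t byte_offset_of_third(void)
{
    short *p = buf;
    short *q = p + 3;              /* 3 elements forward */
    return (char *)q - (char *)p;  /* distance in bytes: 3 * sizeof(short) */
}
```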
-
Perhaps this should be tested but atm I don't have a program set up nor the time because I have plenty of homework to do :(
Maybe if I have some time later I'll have some alpha based sprite run around the screen and go through the edges.
-
Thanks, which version should I use? (And is there a name to put in the credits?)
-
Here's a version of my code with fixed x_inc handling (his latest code doesn't work due to the typecasting)
#define WIDTH 384
#define HEIGHT 216
short * VRAM = (short*) GetVRAMAddress();
void alphaSprite(int x, int y, int width, int height, short * bitmap, short alpha)
{
int x_inc = width;
if (y < 0)
{
bitmap -= y * width;
height += y;
y = 0;
}
if (height > HEIGHT - y) height = HEIGHT - y;
if (x < 0)
{
bitmap -= x;
width += x;
x = 0;
}
if (width > WIDTH - x) width = WIDTH - x;
x_inc -= width;
int y_index;
int x_index;
short * base = y * WIDTH + x + VRAM;
for (y_index = height; y_index > 0; --y_index, base += WIDTH - width, bitmap += x_inc)
{
for (x_index = width; x_index > 0; --x_index, ++base, ++bitmap)
{
if (*bitmap != alpha) *base = *bitmap;
}
}
}
-
OK, thanks. Should I add a name in the credits for the help?
Eww, why do some functions ask for chars, and others for shorts? :/
-
Feel free to add credits if you want. Also, shorts are used because the Prizm has 16-bit color. What functions want char?
-
CopySprite, and all those that use the Sprite Coder tool
-
CopySprite, and all those that use the Sprite Coder tool
I believe those are Kerm's old routines. The reason he used chars is that they are guaranteed to be 1 byte on every C compiler, while a short, though almost always 16 bits, is up to the implementation, and we weren't sure yet what size gcc used for shorts on the SuperH. Now that shorts have been confirmed to be 2 bytes in memory, it is safe to convert those routines from chars.
It is also possible to use 32-bit ints or longs as a speed optimization at the cost of size. The size increases because the code has to check for 32-bit alignment, and if that condition is not met, shorts must be used for the leading and/or trailing pixels. Plus, if the sprites are small, using ints or longs would actually be slower due to the alignment-checking code. Because of this, the only place you'd want 32-bit VRAM writes is where the sprite is guaranteed to be drawn in alignment (for example, a background image); otherwise shorts are perfect for almost all drawing operations.
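The alignment trade-off described above can be sketched roughly like this. This is a hypothetical `fill_row` helper of my own, not one of the routines under discussion: it fills one row of 16-bit pixels, using 32-bit stores only for the 4-byte-aligned middle and falling back to 16-bit stores for any odd leading/trailing pixel (note the `uint32_t*` write also technically bends strict-aliasing rules, as such VRAM code usually does):

```c
#include <stdint.h>
#include <stddef.h>

void fill_row(uint16_t *dst, size_t count, uint16_t color)
{
    /* leading 16-bit store if dst is not 4-byte aligned */
    if (((uintptr_t)dst & 3) && count) { *dst++ = color; count--; }

    /* aligned middle: two pixels per 32-bit store */
    uint32_t pair = ((uint32_t)color << 16) | color;
    uint32_t *dst32 = (uint32_t *)dst;
    while (count >= 2) { *dst32++ = pair; count -= 2; }

    /* trailing odd pixel, if any */
    if (count) *(uint16_t *)dst32 = color;
}
```

For a sprite routine the same split applies per row, which is why small or unaligned sprites gain nothing from the 32-bit path.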
-
So, we must now convert these functions and change the sprites, right?
-
So, we must now convert these functions and change the sprites, right?
It's easier than you would think. The routines stay mostly the same, and the sprites can be kept the way they are. All you need to do is put a (short*) cast on the char sprite array and it can be used with short-based routines. For example, if const char test[8450] is your sprite, convert it with short * newtest = (short*)test;
-
I saw that, and I did that with alpha_clip...
So... maybe tell Cemetech to change Source Coder?
A test with CopySprite:
void CopySprite(short* data, int x, int y, int width, int height) {
short* VRAM = (short*)0xA8000000;
VRAM += (LCD_WIDTH_PX*y + x); // what should I do?
for(int j=y; j<y+height; j++) {
memcpy(VRAM,data,width); // and here?
VRAM += LCD_WIDTH_PX;
data += width;
}
}
Is this right? I can't test it right now.
-
I've got a suggestion:
Zoomed sprites!
Z_sprite(spr, x, y, int factor)
-
Do you mean to say zoomed as in a scaled sprite? That is doable but I think you would want to change that function to say Z_sprite(short * spr, int x, int y, int width, int height, float factor)
It is better to use a float there because the factor is often not an integer value especially if you want to shrink a sprite.
-
Maybe with a float, but I would only use x1, x2. I know that would be quite heavy, but it would save size and RAM: why store a 32*32 sprite when we can just store the same sprite, at 1:1 scale, in a 16*16 tile?
-
The size of float-based routines is really quite negligible, and they don't use any extra RAM because the entire executable is stored in flash. You could use fixed point here, but I would advise against it, as the best way to write the factors out would be pre-defined macros such as TWO_POINT_FIVE, which would be 0x00028000 in 32-bit fixed-point notation. The other alternative is to specify, when calling the function, what size you would like to scale to instead of providing a scale factor.
Perhaps two different functions ought to be developed. One for straightforward scales that can be easily implemented, such as x.5, x2, x4, and so on; this would be called as 2 raised to the x power. For example, passing 0 as the scale results in a sprite with no change in size, while 1 is x2, 2 is x4, 3 is x8, and so on. That also means -1 would be x.5, -2 x.25, and so on. The second routine would require much more overhead and would be called with either a float factor or the new image size. If this sounds good, I can start work on the first routine and have it out before long.
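The power-of-2 scheme above might look roughly like this. This is a sketch of my own, not the promised routine: it writes to a caller-supplied buffer rather than VRAM, and the name `zoom_sprite_pow2` plus the nearest-neighbour sampling are both assumptions.

```c
#include <stdint.h>

/* scale is 2^factor: factor 0 = unchanged, 1 = x2, -1 = x.5, ... */
void zoom_sprite_pow2(const uint16_t *src, int w, int h,
                      uint16_t *dst, int dst_stride, int factor)
{
    int out_w = (factor >= 0) ? (w << factor) : (w >> -factor);
    int out_h = (factor >= 0) ? (h << factor) : (h >> -factor);
    for (int dy = 0; dy < out_h; dy++) {
        /* map each output row/column back to its source pixel */
        int sy = (factor >= 0) ? (dy >> factor) : (dy << -factor);
        for (int dx = 0; dx < out_w; dx++) {
            int sx = (factor >= 0) ? (dx >> factor) : (dx << -factor);
            dst[dy * dst_stride + dx] = src[sy * w + sx];
        }
    }
}
```

Because the factor is a shift count instead of a float, scaling costs only shifts and adds per pixel.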
-
I'm trying to understand: to me, the float system is imprecise, and more CPU-hungry than fixed coefficients.
The system with powers of 2 sounds good to me.
-
Set colors into a short? Useful for paint-like things (that I want to make...)
short setColors(unsigned char red, unsigned char green, unsigned char blue) {
    return ((red & 0x1F) << 11) | ((green & 0x3F) << 5) | (blue & 0x1F);
}
Anyone care to optimize this?
Damn, double post... what should I do?
-
Set colors to a short? Useful for paint-like things (that I want to make...)
void setColors(unsigned char red, unsigned char green, unsigned char blue)
{
return ((red & 0x1F) << 11) | ((green & 0x3F) << 5) | (blue & 0x1F);
}
Anyone want to optimize this?
Damn, double post... What should I do?
Actually, that function should return short or unsigned short, not void. And it's hard to tell how to optimize simple C code like this when it all depends on how the assembler generates the code.
-
Yeah! I had forgotten that!
-
Not much optimization I can see there other than changing the flags in the makefile. The only thing I might do is add some type casts, because your arguments are unsigned chars, which don't have the necessary width for your bit shifts. My own personal preference is to use 24-bit color and then convert it to 16 bits with a simple routine, because it's much easier to write out in code, but your method does have a speed advantage.
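For reference, the 24-bit-to-16-bit conversion mentioned above can be sketched like this (illustrative helper name, assuming the Prizm's 5-6-5 RGB layout):

```c
#include <stdint.h>

/* Convert 8-bit-per-channel RGB to RGB565 by dropping the low bits of
   each channel (sketch; rgb888_to_rgb565 is an illustrative name). */
uint16_t rgb888_to_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```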
-
IsKeyDown equivalent
int keydown(int basic_keycode)
{
const unsigned short* keyboard_register = (unsigned short*)0xA44B0000;
int row, col, word, bit;
row = basic_keycode%10;
col = basic_keycode/10-1;
word = row>>1;
bit = col + 8*(row&1);
return (0 != (keyboard_register[word] & 1<<bit));
}
It expects a Basic keycode (27=right, 38=left), or 10 to test the AC/ON key.
It can detect multiple keys pressed simultaneously.
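The row/column bit math above can be exercised off-calculator by reading from a supplied snapshot instead of the hardware register at 0xA44B0000 (a sketch; `keydown_in` is a hypothetical helper name):

```c
#include <stdint.h>

/* Same decoding as keydown() above, but against a caller-supplied
   snapshot of the keyboard register words (hypothetical helper for
   checking the bit math off-calc). */
int keydown_in(const uint16_t *regs, int basic_keycode)
{
    int row  = basic_keycode % 10;     /* e.g. 27 -> row 7 */
    int col  = basic_keycode / 10 - 1; /* e.g. 27 -> col 1 */
    int word = row >> 1;
    int bit  = col + 8 * (row & 1);
    return (regs[word] & (1 << bit)) != 0;
}
```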
-
The size of float based routines is really quite negligible and they don't use any extra ram because the entire executable is stored in flash. You could use a fixed point float in this situation but I would advise against them in this situation as the best way to write them out would be to use pre-defined macros such as TWO_POINT_FIVE which would be 0x00028000 in a 32 bit fixed point notation. The other alternative here is to specify when calling the function what size you would like to scale it to instead of providing a scale factor. Perhaps in this situation 2 different functions ought to be developed. One for rather straightforward scales that can be easily implemented such as x.5, x2, x4 and so on. This would be called as 2 raised to the x power. For example passing 0 as the scale will result in a sprite with no change in size while 1 will be x2, 2 as x4, 3 as x8, and so on. That would also mean that -1 would be .5, -2 as .25 and so on. The second routine would require much more overhead and be called with either a float factor or specify the new image size. If this sounds good I can start work on the first routine and have that out in not too long.
Bump-bi-di-bump, bumping the topic to ask someone to write a zoom-scaled sprite drawing function. That could be very useful for some animations..
I still think that 2^n factors would be easier and faster to use...
moderator edit: fixed
-
I think you put your answer inside the quote Eiyeron ???
-
Bump-bi-di-bump, bumping the topic to ask someone to write a zoom-scaled sprite drawing function. That could be very useful for some animations..
I still think that 2^n factors would be easier and faster to use...
moderator edit: fixed
For shrinking sprites, powers of 2 would be much faster, while for enlarging there wouldn't be a difference. What could be done is for the routine to check whether a power of 2 is used and then resize with the proper routine. It isn't too hard to check whether an integer is a power of 2, especially if you limit the scale to something like ×16.
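The power-of-two check mentioned above is a one-liner with the classic bit trick (a sketch; `is_pow2` is an illustrative name):

```c
/* n > 0 is a power of two iff it has exactly one bit set:
   n & (n - 1) clears the lowest set bit, leaving zero. */
int is_pow2(int n)
{
    return n > 0 && (n & (n - 1)) == 0;
}
```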
-
I think you put your answer inside the quote Eiyeron ???
I think I miswrote somewhere: I was asking for someone's help to write this function...
Anyway, here is my function for my project.
Two pixels are concatenated like this:
0bAAAABBBB
or
0xAB
void CopySprite_Palette_Alpha_Nibbles(unsigned char* data, unsigned short* palette, int x, int y, int width, int height) {
unsigned short* VRAM = (unsigned short*)0xA8000000;
unsigned short* ptr = VRAM + y*LCD_WIDTH_PX + x;
int i,j;
unsigned char nibble; // The color index to use.
for(j=0; j<height; j++) {
for(i = 0; i < width; i+=2)
{
nibble = (*data)>>4; // First pixel: high nibble
if(nibble) // Index 0 is transparent (alpha)
*ptr = palette[nibble]; // Copy from the palette
nibble = (*data) & 0x0F; // Second pixel: low nibble
if(nibble) // Same transparency rule again
*(ptr+1) = palette[nibble];
/* else
*(ptr+1) = palette[0]; */ // For tests
ptr+=2; // Advance two pixels in VRAM
data++; // One byte holds two pixels
}
ptr += LCD_WIDTH_PX-width; // Go down one line.
}
}
Max 16 colours, enough for Pokémon, for example... :-°
The first index is transparent (alpha).
(Could you please also adapt this function to add a zoom factor? I would be eternally grateful.)
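For building sprite data in this 0xAB layout, the packing step can be sketched as follows (`pack_nibbles` is a hypothetical helper; it assumes an even pixel count, like the drawing routine above):

```c
#include <stdint.h>
#include <stddef.h>

/* Pack 4-bit palette indices (one per byte, values 0-15) two to a byte,
   high nibble first, matching the 0xAB layout described above.
   (Illustrative sketch; assumes count is even.) */
void pack_nibbles(const uint8_t *indices, uint8_t *out, size_t count)
{
    size_t i;
    for (i = 0; i + 1 < count; i += 2)
        out[i / 2] = (uint8_t)((indices[i] << 4) | (indices[i + 1] & 0x0F));
}
```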
-
Could I get some comments on that routine? It's always good practice, because it can often be difficult for some people to understand others' code. I'm out of town right now, but I'll write something up when I get back tomorrow.
-
Okay... I'll do that!
EDIT: DONE!
-
Sorry to go off topic, but z80man, PLEASE could you tell me how you got a GB emulator on your Casio, and is there a way to run TI games on a Casio?
Edit: oh wait, I was on topic, lol. But is there something like TI-Boy for the Casio Prizm that converts the file to a calc format rather than running it in an emulator, or maybe even an emulator? (I want Pokémon, but emulators are usually too slow; I'm sure GBC would be fine on the Prizm if I can't use the other method.)
-
Sorry to go off topic, but z80man, PLEASE could you tell me how you got a GB emulator on your Casio, and is there a way to run TI games on a Casio?
Edit: oh wait, I was on topic, lol. But is there something like TI-Boy for the Casio Prizm that converts the file to a calc format rather than running it in an emulator, or maybe even an emulator? (I want Pokémon, but emulators are usually too slow; I'm sure GBC would be fine on the Prizm if I can't use the other method.)
TI-Boy is actually an emulator. There aren't any emulators for the Prizm right now, and I don't know of any current projects to make one.
-
CopySprite with alpha, palette, and horizontal clipping:
void CopySprite_Palette_Alpha_clipping(const unsigned char* data, const unsigned short* palette, int x, int y, int width, int height)
{
unsigned short* VRAM = (unsigned short*)VRAM_ADRESS;
unsigned short* ptr = VRAM + y*LCD_WIDTH_PX + x;
int i,j;
int real_width = x+width > LCD_WIDTH_PX ? LCD_WIDTH_PX - x : width; // Clip against the right edge
int decal = x < 0 ? -x : 0; // Pixels to skip when clipped on the left
if(real_width <= 0 || decal >= width) return; // Sprite entirely off-screen
for(j=0; j<height; j++) {
ptr += decal;
data += decal;
for(i = decal; i < real_width; i++)
{
if(*data) // Index 0 is transparent
*ptr = palette[*data];
ptr++;
data++;
}
data += width - real_width; // Skip the clipped part of the source row
ptr += LCD_WIDTH_PX-real_width; // Move to the next screen line
}
}
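The horizontal clipping arithmetic in this routine can be checked on its own off-calc (a sketch; `clip_row` is a hypothetical helper, with LCD_WIDTH_PX assumed to be 384 as on the Prizm):

```c
#define LCD_WIDTH_PX 384

/* Mirror of the routine's clipping math: *skip is the number of leading
   source pixels hidden off the left edge, *visible how many get drawn. */
void clip_row(int x, int width, int *skip, int *visible)
{
    int real_width = (x + width > LCD_WIDTH_PX) ? LCD_WIDTH_PX - x : width;
    int decal = x < 0 ? -x : 0;
    if (real_width <= 0 || decal >= width) { *skip = 0; *visible = 0; return; }
    *skip = decal;
    *visible = real_width - decal;
}
```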
-
Hey z80man, I was just wondering: is it possible to have an emulator on the Casio Prizm? Any type of emulator, because I see your avatar picture with Pokémon on the screen, and I was wondering whether that was real or photoshopped?
-
Emulators are certainly possible, but nobody's written any. Don't look at me... Also I'm pretty sure that's photoshopped from a common TI-Boy blog screenshot :P