Omnimaga
Calculator Community => TI Calculators => Calculator C => Topic started by: fb39ca4 on June 16, 2011, 02:43:01 pm
-
For the contest, I am doing this thing that involves a raycasting. I would just use bwang's engine, but I'm guessing it would hurt my originality score, and it is kind of slow with all the features he added and the fact it uses floating point math. So, I'm writing my own, with fixed point math and less features, but I can't wrap my head around fixed point division. Can someone explain it to me? I'm using a 16.16 format, if it matters.
-
You're using a 16.16 format? That seems a bit overkill, and it even makes adding/subtracting hard, right?
Edit: OHWAIT this is for the Nspire contest. durrrr
Edit2:
I think something like this is how it's done:
int quotient = ((long long)dividend << 16)/divisor;
-
Thanks!
Also, just want to make sure, am I doing multiplication right?
product = (a >> 8) * (b >> 8)
product = (a >> 8 ) * (b >> 8 )
-
Thanks!
Also, just want to make sure, am I doing multiplication right?
product = (a >> 8) * (b >> 8)
If you want a fully accurate answer, I think you'd need ((long long)a * b) >> 16
Hmm, maybe I should write some ARM ASM routines for fixed-point multiplication/division. Doing it normally through GCC is going to be a bit inefficient.
Edit:
Just to confirm, are you using signed values?
-
But the way I did it would prevent overflows, because I am only using 32 bit integers the whole time. In my case, not getting the integer part right would be worse than getting a tiny part of the fractional part wrong.
Yes, I'm using signed numbers. Will that introduce problems?
Also, am I understanding the division correctly? a / b in fixed point is [integer part of a]/b?
Can you use 64 bit numbers on the Nspire?
Hey, it's my 888th post!
-
But the way I did it would prevent overflows, because I am only using 32 bit integers the whole time.
Yes, I'm using signed numbers. Will that introduce problems?
Also, am I understanding the division correctly? a / b in fixed point is [integer part of a]/b?
You can have overflow when multiplying 32-bit integers, if the absolute value of the result is more than 2^31. If you want, I could implement saturation in my routines (so anything outside of the range will be turned into the maximum or minimum value).
And the division is (a * 2^16)/b (this is to nullify the factor of 2^16 present in b)
-
Apparently ARM7 does not natively support long long integers. So that means I would get odd results if I tried division with any number with an absolute value of over 1.
Would (a << 8 ) / (b >> 8 ) work? I would still get an overflow if I used anything over 65536 in fixed point, but I don't think I will get to numbers that high.
Also, is there a tag to disable smileys?
-
Apparently ARM7 does not natively support long long integers. So that means I would get odd results if I tried division with any number with an absolute value of over 1.
Would (a << 8 ) / (b >> 8 ) work? I would still get an overflow if I used anything over 65536 in fixed point, but I don't think I will get to numbers that high.
Also, is there a tag to disable smileys?
No ARM processor natively supports 64-bit, but GCC should be able to generate code anyway.
A big problem with (a << 8) / (b >> 8) is that the top 8 bits of a are lost before the division. This means that numbers greater than 256 wouldn't work.
By the way, I've finished with a fixed-point multiplication routine now and I'm working on division. You should be able to put this in a .S file and include it in your makefile, and you can use it if you define the function prototype in your C code.
@ int fixed_mul(int, int);
.global fixed_mul
fixed_mul:
@ Multiply 32x32->64 and shift right by 16
smull r0,r2,r1,r0
mov r0,r0,lsr #16
orr r0,r0,r2,lsl #16
@ Overflow check start
teq r2,r0,asr #16
movne r0,#0x7FFFFFFF
eorne r0,r0,r2,asr #31
@ Overflow check end
bx lr
Edit: Here's the fixed-point division routine (note: neither of these routines have been tested yet)
@ int fixed_div(int, int);
.global fixed_div
fixed_div:
@ r2 = abs(r0)
eors r2,r0,r0,asr #32
adc r2,r2,#0
@ r3 = abs(r1)
eors r3,r1,r1,asr #32
adc r3,r3,#0
@ Get sign of result
eor r1,r1,r0
@ Initialize result
mov r0,#0x7FFFFFFF
@ Check for overflow
cmp r3,r2,lsr #15
bls fixed_div_end
.irp shift,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0
cmp r3,r2,lsr #\shift
subls r2,r2,r3,lsl #\shift
bichi r0,r0,#0x10000 << \shift
.endr
.irp shift,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0
rsbs r2,r3,r2,lsl #1
addlo r2,r2,r3
biclo r0,r0,#1 << \shift
.endr
fixed_div_end:
@ Adjust sign
eors r0,r0,r1,asr #32
adc r0,r0,#0
bx lr
Edit2: It's looking likely that the division routine doesn't work :(
Edit3: Replaced the division routine with one that might work (at least, the equivalent C code seems to work)
-
For my purposes, it will be fine, as I don't expect to have maps greater than 64 or so blocks, and 1 unit in my engine is a block.
-
I was getting major errors so I replaced my multiply and divides with the ones in C you mentioned above, using long longs, and now I am getting errors when I compile:
$ make
nspire-gcc -Os -Wall -W -c main.c
main.c: In function 'main':
main.c:52:7: warning: unused variable 'oldtime'
main.c:51:7: warning: unused variable 'time'
main.c:46:8: warning: unused variable 'timer'
nspire-gcc -Os -Wall -W -c utils.c
utils.c: In function 'filesize':
utils.c:129:3: warning: passing argument 2 of 'stat' from incompatible pointer type
c:/ndless-v2.0/sdk/include/os.h:288:1: note: expected 'struct stat *' but argument is of type 'char *'
nspire-gcc -Os -Wall -W -c graphics.c
nspire-gcc -Os -Wall -W -c math.c
nspire-ld main.o utils.o graphics.o math.o -o raycaster.elf
c:/program files/yagarto/bin/../lib/gcc/arm-none-eabi/4.5.1\libgcc.a(unwind-arm.o): In function `get_eit_entry':
C:\msys\1.0\home\yagarto\gcc-build\arm-none-eabi\libgcc/../../../gcc-4.5.1/libgcc/../gcc/config/arm/unwind-arm.c:614: undefined reference to `__exidx_start'
C:\msys\1.0\home\yagarto\gcc-build\arm-none-eabi\libgcc/../../../gcc-4.5.1/libgcc/../gcc/config/arm/unwind-arm.c:614: undefined reference to `__exidx_end'
c:/program files/yagarto/bin/../lib/gcc/arm-none-eabi/4.5.1/../../../../arm-none-eabi/lib\libc.a(lib_a-abort.o): In function `abort':
C:\msys\1.0\home\yagarto\newlib-build\arm-none-eabi\newlib\libc\stdlib/../../../../../newlib-1.18.0/newlib/libc/stdlib/abort.c:63: undefined reference to `_exit'
c:/program files/yagarto/bin/../lib/gcc/arm-none-eabi/4.5.1/../../../../arm-none-eabi/lib\libc.a(lib_a-signalr.o): In function `_kill_r':
C:\msys\1.0\home\yagarto\newlib-build\arm-none-eabi\newlib\libc\reent/../../../../../newlib-1.18.0/newlib/libc/reent/signalr.c:61: undefined reference to `_kill'
c:/program files/yagarto/bin/../lib/gcc/arm-none-eabi/4.5.1/../../../../arm-none-eabi/lib\libc.a(lib_a-signalr.o): In function `_getpid_r':
C:\msys\1.0\home\yagarto\newlib-build\arm-none-eabi\newlib\libc\reent/../../../../../newlib-1.18.0/newlib/libc/reent/signalr.c:96: undefined reference to `_getpid'
collect2: ld returned 1 exit status
make: *** [raycaster.tns] Error 1
-
The kill and getpid POSIX calls aren't referenced by your program, but by stuff internal to newlib, and stuff which shouldn't be linked with your program, at that...
Try the -nostdlib flag. A number of open-coded operations (e.g. 32-bit division) won't be accessible, but it may remove references to kill, getpid and friends.
And "utils.c:129:3: warning: passing argument 2 of 'stat' from incompatible pointer type + c:/ndless-v2.0/sdk/include/os.h:288:1: note: expected 'struct stat *' but argument is of type 'char *'" is a very important warning. Fix it, so that the incorrect generated code does not bite you at runtime.
-
Okay, I just removed the function from utils.c that gave the warning since I'm not using it anyways. I also tried using -nostdlib, and the compiler output looks normal, but I don't get a tns document. Also, not having 32 bit division will be an issue for me if I get -nostdlib working.
-
The error messages disappear when I don't use long long, so I set about writing a division routine that does not use long long.
Can someone confirm this works?
inline long div(long a, long b) {
long a1;
a1 = a >> 16;
a = (a << 16) / b;
a1 = (a1 << 16) / b;
return (a1 << 16) + a;
}
-
I have a multiplication routine which almost works. Basically, I isolate the upper and lower halves of a and b each to the bottom half of a new variable, use the FOIL method and add them together.
inline long mul(long a, long b) {
long a1 = a >> 16;
long a2 = a & 0x0000FFFF;
long b1 = b >> 16;
long b2 = b & 0x0000FFFF;
return ((a1 * b1) << 16) + (a1 * b2) + (a2 * b1) + ((a2 + b2) << 16);
}
I tried to multiply 0x00028000 and 0x0002000 (2 * 2.5), but I got 0x80050000, it seems that the sign bit gets messed up. What am I doing wrong?
-
I think your ((a2 + b2) << 16) needs to be ((a2*b2) >> 16). Actually, I think it might be a good idea to specify your variables there as unsigned short and signed short, like so (otherwise the ((a2*b2)>>16) might turn out badly):
inline long mul(long a, long b) {
signed short a1 = a >> 16;
unsigned short a2 = a;
signed short b1 = b >> 16;
unsigned short b2 = b;
return ((a1 * b1) << 16) + (a1 * b2) + (a2 * b1) + ((a2 * b2) >> 16);
}
-
Wow that was a fail.
*facepalm* *bangs head on table*
I never thought of using short ints in there. Thanks for suggesting that, now it works fine. This seems kind of inefficient, but it'll have to do for now, until i find something faster.
now, to work on division...
would this work?
a1 = upper half of a, a2 = lower half of a, in the same way as the multiplication routine
a / b = (((a1 << 16) / b) << 16 + ((a2 << 16) / b)?
Here's some C code:
inline long div(long a, long b) {
long a1 = a & 0xFFFF0000;
short a2 = a;
return ((a1 / b) << 16) + ((a << 16) / b);
}
With my testing, it seems to work okay, but can someone confirm it?
I really wish ARM processors just had a damn FPU now.
-
Watch out that the low 16-bit part should again be treated as unsigned (or else you'll get weird stuff when bit 15 is 1). Technically, I think that method still loses some precision in the ((a1 / b) << 16) when compared to a higher-precision divide such as ((long long)a1 << 16) / b), due to the fact that you're shifting in 16 zeros whereas the long long division actually calculates the low 16 bits.
If you're that worried about performance you can always use the assembly routines I posted (the division routine takes about the same number of steps as a normal 32-bit division). I understand if you don't want to use them because it's for a contest, though. Then again, I'd let anyone use this code anyway.
-
I would just use the routines with long long, but then I get compiler errors with newlib. Using your asm routines in a contest won't be a big deal, right? I might not have to use them, though, depending on how fast it runs.
-
I would just use the routines with long long, but then I get compiler errors with newlib. Using your asm routines in a contest won't be a big deal, right? I might not have to use them, though, depending on how fast it runs.
You'd have to ask the people in charge of the contest, I guess. I consider these routines to be like the built-in arithmetic operations except with fixed-point instead of integer. After all, the normal 32-bit division routine used by GCC was written by someone else, heh
-
True, and I guess I am using the getPixel routine and all the others that bwang made.
-
Yay, multiplication and division work now! :D
Now, I have to get the engine to stop dividing by zero or I will be responsible for many black holes created inside calcs. :mad:
-
Now that I think of it, shouldn't using unsigned short cause errors when I am multiplying negatives numbers, because say -0x0000.000F would be represented as 0x0000.FFF0 in the original number, and by using unsigned, the
computer calc wouldn't be aware its a negative number.
-
Actually, it's represented as FFFF.FFF0, which is FFFF.0000 (signed -1) plus 0.FFF0 (unsigned). That's why the upper 16 bits is treated as signed and the lower 16 bits is unsigned.
-
Oh, right. But anyways, wouldn't the fact that the bottom half is still signed, but not treated so in the multiplication routine mess stuff up?
I'm just going to do some experiments in visual studio and see what happens.
-
Oh, right. But anyways, wouldn't the fact that the bottom half is still signed, but not treated so in the multiplication routine mess stuff up?
That's the thing, the bottom half isn't signed. Really, what determines signedness is whether the top bit means the entire value is offset by some 2^N. Of course, only bit 0x80000000 of the whole 32-bit value determines this, which means that any parts of the value not including that bit are unsigned, and parts that include it are signed. You can think of signed values having their top bit repeated infinitely to the left.