Messages  Xeda112358
31
« on: March 29, 2019, 09:50:21 am »
I forgot to upload it here! With the recent updates to z80float, I updated the existing float routines and added in the rest.
 sin( and cos( now have range reduction!
 Division now has proper underflow/overflow detection.
 Fixed a bug with the logarithm routines when the input's exponent was 0 (so on [1,2)).
 Added in mean( to compute the mean of two numbers.
 I switched e^( to perform exp(), not 2^x (see note below).
 Added in 10^(, tan(, sinh(, cosh(, tanh(, sin^-1(, cos^-1(, tan^-1(, sinh^-1(, cosh^-1(, tanh^-1(.
I also changed the token hook so that e^( was no longer renamed 2^(. The regular routine still performs 2^n, but the float routine will compute exp(x).
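As a rough illustration of what range reduction for sin( and cos( means, here is a small Python model. It assumes reduction modulo 2π into [-π, π]; the name `reduce_angle` is made up for this sketch, and the actual z80float routines may well reduce differently.

```python
import math

def reduce_angle(x: float) -> float:
    """Map x to an equivalent angle in [-pi, pi] (hypothetical helper)."""
    # math.remainder returns x - n*tau, where n is the integer nearest x/tau.
    return math.remainder(x, math.tau)

# A large input and its reduced form give (nearly) the same sine.
x = 100.25
r = reduce_angle(x)
print(r, math.sin(r), math.sin(x))
```

A polynomial or table approximation then only has to be accurate on the small reduced interval.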
32
« on: March 27, 2019, 11:05:08 pm »
Here is the latest version! I fixed the sine/cosine bugs and then found some more potential issues with the routine. Here is a screenshot of the cosine routine in action (via Grammer, but they are the same routines).
33
« on: March 27, 2019, 07:37:32 pm »
I now have 99 bytes left! I realized I could reuse a routine in the single-precision square-root routine that I was using in the extended-precision one, so I saved a bunch on memory. I'm waiting to make another update until I can figure out an issue I noticed in Grammer with the sine and cosine routines. It seems to be giving wacky results when the result is close to +1. Or maybe there is a glitch in the float -> int routine.
34
« on: March 26, 2019, 06:57:36 pm »
I am down to 3 bytes left in the app!
Thankfully it is just about finished. I broke some compatibility, but I don't think people are really using those parts yet. In particular, I removed 5 float constants used in constSingle, and I changed constSingle. The readme and in-app description differed on usage and I changed it to be that described in-app. So before, you would get a pointer to a float with ID 0 (pi) by doing:
call iconstSingle \ .db 0
But the in-app description says the arg is passed in A. So now it is:
ld a,0 \ call constSingle
Since this uses the z80float library, most of the other changes were inherited from that. This includes many bug fixes and optimizations (most notably, the extended-precision square root routine is much smaller and a little faster, and divSingle detects underflow and overflow better). Finally, sinSingle and cosSingle apply range reduction.
And then there were the routines that were added:
 xmod1: basically gets the non-integer part of the float
 mod1Single: basically gets the non-integer part of the float
 xconst: the extended-precision variant of constSingle
 ti2single: converts a TI float to a single-precision float
 TItox: converts a TI float to an extended-precision float
 xtoTI: converts an extended-precision float to a TI float
 xcosh: extended-precision cosh
 xcos: extended-precision cos
 xsinh: extended-precision sinh
 xsin: extended-precision sin
 xtanh: extended-precision tanh
 xtan: extended-precision tan
And with that, I think I've included all of the routines that I originally intended! Now I just have to work on tidying up and finding and fixing bugs.
EDIT: Updated readme in the attachment
35
« on: March 26, 2019, 08:52:34 am »
There are legal issues involved with reverse engineering, especially if it risks "taking away someone's livelihood." If they need an ID then that means they probably compile a special version that can only be run on that calc (or calcs with those matching digits). That way people can't just post the binary files and have everyone get access. It is highly unlikely that their trial versions are full, as it would be complicated to allow up to 100000000 different codes to "unlock" it. In that case, it might actually be a simple way to crack the security, but you would have to buy a copy first. Finally, please don't double-post like that. You can just edit your post to add more.
36
« on: March 25, 2019, 08:52:43 pm »
Actually, jp points to a fixed location whereas jr is relative. So in this case, jr $F1 (18F1) just states that it will jump back 15 bytes from the end of the instruction.
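To make the relative jump concrete, here is a small Python sketch of how a `jr` target is computed from the offset byte. The function name `jr_target` and the example address $9D95 are made up for illustration and are not from the thread.

```python
def jr_target(instr_addr: int, offset_byte: int) -> int:
    """Address reached by a Z80 `jr` located at instr_addr (illustrative)."""
    # The offset byte is a signed 8-bit value in two's complement.
    disp = offset_byte - 0x100 if offset_byte >= 0x80 else offset_byte
    # jr is two bytes long, and the displacement counts from the byte
    # after the instruction.
    return (instr_addr + 2 + disp) & 0xFFFF

# $F1 is -15, so this lands 15 bytes before the end of the instruction.
print(hex(jr_target(0x9D95, 0xF1)))  # 0x9d88
```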
37
« on: March 25, 2019, 04:06:26 pm »
Sorry, I don't have my nspire on me, but those just look like partial files. What makes you think they are the same as the full version (without the key)? I would think they'd offer just a simple trial version and give you the full download (if it even exists) after you pay. Personally, I don't think it's worth even the time to disassemble or crack based on this thread.
38
« on: March 25, 2019, 03:57:25 pm »
If you want those games you should make them
39
« on: March 25, 2019, 02:15:17 pm »
I don't know without personally inspecting it. I'm just assuming they are using TI's encryption, which does use RSA. I could be wrong; maybe they are using their own encryption.
40
« on: March 25, 2019, 01:41:43 pm »
I'll be honest, paid calc software just seems like a scam to me (not technically, but with tens of thousands of programs for free on ticalc.org...). If that's the only place to get it then you'll just have to make your own versions until somebody can crack the (probably) RSA encryption. With the current state of the art, that'll take a trillion-ish years.
41
« on: March 24, 2019, 10:12:01 pm »
A simple way is to do something like:
loop:
  bcall(_GetCSC)
  cp 15         ; check [clear]
  jr nz,loop
  ret
But if you want the more complicated (and less energy efficient) way:
  di            ;disables interrupts since the OS will mess with port 1
  ld a,$FD      ;we'll be polling for keys [ENTER] up to [CLEAR]
  out (1),a
loop:
  in a,(1)
  and $40       ;checks bit 6 which corresponds to clear. Set if not pressed, reset if pressed
  jr nz,loop
  ret
But my preferred way is:
  ei            ;keep OS interrupts active
loop:
  halt
  ld a,(kbdScanCode)
  cp 15
  jr nz,loop
  ret
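The bit test in the port-polling version can be modeled in a couple of lines of Python. This is only a sketch of the `and $40` check described above; the names `CLEAR_BIT` and `clear_pressed` are invented for the example.

```python
CLEAR_BIT = 0x40  # bit 6 of the byte read from port 1 after writing $FD

def clear_pressed(port_value: int) -> bool:
    """True if [CLEAR] is down, given the byte read from port 1."""
    # The keypad is active low: the bit is reset while the key is pressed.
    return (port_value & CLEAR_BIT) == 0

print(clear_pressed(0xFF))  # no keys pressed
print(clear_pressed(0xBF))  # bit 6 reset
```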
42
« on: March 24, 2019, 05:59:36 pm »
Okay, but do you have a model and version number? On the back of your calc you'll see a serial number followed by something that looks like A1234B which is the part that gives us info.
It sounds to me like your physical calculator has one of the *really* slow LCDs, but ALCDFIX should have fixed that on your physical calc.
43
« on: March 24, 2019, 11:58:29 am »
32-bit square root:
sqrtHLIX:
;Input: HLIX
;Output: DE is the sqrt, AHL is the remainder
;speed: 754+{0,1}+6{0,6}+{0,3+{0,18}}+{0,38}+sqrtHL
;min: 1130
;max: 1266
;avg: 1190.5
;167 bytes
  call sqrtHL
  add a,a
  ld e,a
  jr nc,+_
  inc d
_:
  ld a,ixh
  sll e \ rl d
  add a,a \ adc hl,hl
  add a,a \ adc hl,hl
  sbc hl,de
  jr nc,+_
  add hl,de
  dec e
  .db $FE     ;start of `cp *`
_:
  inc e
  sll e \ rl d
  add a,a \ adc hl,hl
  add a,a \ adc hl,hl
  sbc hl,de
  jr nc,+_
  add hl,de
  dec e
  .db $FE     ;start of `cp *`
_:
  inc e
  sll e \ rl d
  add a,a \ adc hl,hl
  add a,a \ adc hl,hl
  sbc hl,de
  jr nc,+_
  add hl,de
  dec e
  .db $FE     ;start of `cp *`
_:
  inc e
  sll e \ rl d
  add a,a \ adc hl,hl
  add a,a \ adc hl,hl
  sbc hl,de
  jr nc,+_
  add hl,de
  dec e
  .db $FE     ;start of `cp *`
_:
  inc e
;Now we have four more iterations
;The first two are no problem
  ld a,ixl
  sll e \ rl d
  add a,a \ adc hl,hl
  add a,a \ adc hl,hl
  sbc hl,de
  jr nc,+_
  add hl,de
  dec e
  .db $FE     ;start of `cp *`
_:
  inc e
  sll e \ rl d
  add a,a \ adc hl,hl
  add a,a \ adc hl,hl
  sbc hl,de
  jr nc,+_
  add hl,de
  dec e
  .db $FE     ;start of `cp *`
_:
  inc e
sqrt32_iter15:
;On the next iteration, HL might temporarily overflow by 1 bit
  sll e \ rl d      ;sla e \ rl d \ inc e
  add a,a
  adc hl,hl
  add a,a
  adc hl,hl         ;This might overflow!
  jr c,sqrt32_iter15_br0
;
  sbc hl,de
  jr nc,+_
  add hl,de
  dec e
  jr sqrt32_iter16
sqrt32_iter15_br0:
  or a
  sbc hl,de
_:
  inc e
;On the next iteration, HL is allowed to overflow, DE could overflow with our current routine, but it needs to be shifted right at the end, anyways
sqrt32_iter16:
  add a,a
  ld b,a            ;either 0x00 or 0x80
  adc hl,hl
  rla
  adc hl,hl
  rla
;AHL - (DE+DE+1)
  sbc hl,de \ sbc a,b
  inc e
  or a
  sbc hl,de \ sbc a,b
  ret p
  add hl,de
  adc a,b
  dec e
  add hl,de
  adc a,b
  ret
This uses this sqrtHL routine:
sqrtHL:
;returns A as the sqrt, HL as the remainder, D = 0
;min: 376cc
;max: 416cc
;avg: 393cc
  ld de,$5040
  ld a,h
  sub e
  jr nc,+_
  add a,e
  ld d,$10
_:
  sub d
  jr nc,+_
  add a,d
  .db $01     ;start of ld bc,** which is 10cc to skip the next two bytes.
_:
  set 5,d
  res 4,d
  srl d
  set 2,d
  sub d
  jr nc,+_
  add a,d
  .db $01     ;start of ld bc,** which is 10cc to skip the next two bytes.
_:
  set 3,d
  res 2,d
  srl d
  inc d
  sub d
  jr nc,+_
  add a,d
  dec d       ;this resets the low bit of D, so `srl d` resets carry.
  .db $06     ;start of ld b,* which is 7cc to skip the next byte.
_:
  inc d
  srl d
  ld h,a
  sbc hl,de
  ld a,e
  jr nc,+_
  add hl,de
_:
  ccf
  rra
  srl d
  rra
  ld e,a
  sbc hl,de
  jr nc,+_
  add hl,de
  .db $01     ;start of ld bc,** which is 10cc to skip the next two bytes.
_:
  or %00100000
  xor %00011000
  srl d
  rra
  ld e,a
  sbc hl,de
  jr nc,+_
  add hl,de
  .db $01     ;start of ld bc,** which is 10cc to skip the next two bytes.
_:
  or %00001000
  xor %00000110
  srl d
  rra
  ld e,a
  sbc hl,de
  jr nc,+_
  add hl,de
  srl d
  rra
  ret
_:
  inc a
  srl d
  rra
  ret
sqrtHL was from my work on the float routines, sqrtHLIX was inspired by this thread
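For anyone wanting to check results off-calc, here is a Python model of the same two-stage idea: take the 16-bit root of the upper half first, then run eight more bit-pair iterations over the lower half. It is only a sketch; `math.isqrt` stands in for sqrtHL, and the name `sqrt32_two_stage` is made up for the example.

```python
import math

def sqrt32_two_stage(x: int):
    """Return (root, remainder) of a 32-bit x, seeded from the top 16 bits."""
    hi, lo = (x >> 16) & 0xFFFF, x & 0xFFFF
    root = math.isqrt(hi)          # stand-in for the sqrtHL call
    rem = hi - root * root
    for k in range(8):             # eight remaining iterations
        pair = (lo >> (14 - 2 * k)) & 3
        rem = (rem << 2) | pair    # shift in the next two input bits
        trial = (root << 2) | 1    # 4*root + 1
        root <<= 1
        if rem >= trial:
            rem -= trial
            root |= 1
    return root, rem

print(sqrt32_two_stage(0xFFFFFFFF))
```

The returned pair plays the role of DE (root) and AHL (remainder) in the assembly version; the remainder can need 17 bits, which is why A is involved.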
44
« on: March 23, 2019, 12:58:23 pm »
I bet if the Z80 gods got in here they could do even better! As it is, I see a neat optimization saving 3 bytes and on average 5cc, but I think I'll wait to make more edits. Basically, in the last iteration instead of:
sqrt32_iter16:
  add a,a
  adc hl,hl
  rla
  adc hl,hl
  rla
;AHL - (DE+DE+1)
  sbc hl,de \ sbc a,0
  inc e
  or a
  sbc hl,de \ sbc a,0
  ret p
  add hl,de
  adc a,0
  dec e
  add hl,de
  adc a,0
  ret
It could be:
sqrt32_iter16:
  add a,a
  ld b,a            ;either 0x00 or 0x80
  adc hl,hl
  rla
  adc hl,hl
  rla
;AHL - (DE+DE+1)
  sbc hl,de \ sbc a,b
  inc e
  or a
  sbc hl,de \ sbc a,b
  ret p
  add hl,de
  adc a,b
  dec e
  add hl,de
  adc a,b
  ret
I'll have to test that and other optimizations after work (if I have time tonight).
EDIT: Here is an unrolled version that includes the above optimization and works with input=HLIX. This code is 167 bytes, but including the sqrtHL routine posted earlier, the total size is 266 bytes.
sqrtHLIX:
;Input: HLIX
;Output: DE is the sqrt, AHL is the remainder
;speed: 754+{0,1}+6{0,6}+{0,3+{0,18}}+{0,38}+sqrtHL
;min: 1130
;max: 1266
;avg: 1190.5
;167 bytes
  call sqrtHL
  add a,a
  ld e,a
  jr nc,+_
  inc d
_:
  ld a,ixh
  sll e \ rl d
  add a,a \ adc hl,hl
  add a,a \ adc hl,hl
  sbc hl,de
  jr nc,+_
  add hl,de
  dec e
  .db $FE     ;start of `cp *`
_:
  inc e
  sll e \ rl d
  add a,a \ adc hl,hl
  add a,a \ adc hl,hl
  sbc hl,de
  jr nc,+_
  add hl,de
  dec e
  .db $FE     ;start of `cp *`
_:
  inc e
  sll e \ rl d
  add a,a \ adc hl,hl
  add a,a \ adc hl,hl
  sbc hl,de
  jr nc,+_
  add hl,de
  dec e
  .db $FE     ;start of `cp *`
_:
  inc e
  sll e \ rl d
  add a,a \ adc hl,hl
  add a,a \ adc hl,hl
  sbc hl,de
  jr nc,+_
  add hl,de
  dec e
  .db $FE     ;start of `cp *`
_:
  inc e
;Now we have four more iterations
;The first two are no problem
  ld a,ixl
  sll e \ rl d
  add a,a \ adc hl,hl
  add a,a \ adc hl,hl
  sbc hl,de
  jr nc,+_
  add hl,de
  dec e
  .db $FE     ;start of `cp *`
_:
  inc e
  sll e \ rl d
  add a,a \ adc hl,hl
  add a,a \ adc hl,hl
  sbc hl,de
  jr nc,+_
  add hl,de
  dec e
  .db $FE     ;start of `cp *`
_:
  inc e
sqrt32_iter15:
;On the next iteration, HL might temporarily overflow by 1 bit
  sll e \ rl d      ;sla e \ rl d \ inc e
  add a,a
  adc hl,hl
  add a,a
  adc hl,hl         ;This might overflow!
  jr c,sqrt32_iter15_br0
;
  sbc hl,de
  jr nc,+_
  add hl,de
  dec e
  jr sqrt32_iter16
sqrt32_iter15_br0:
  or a
  sbc hl,de
_:
  inc e
;On the next iteration, HL is allowed to overflow, DE could overflow with our current routine, but it needs to be shifted right at the end, anyways
sqrt32_iter16:
  add a,a
  ld b,a            ;either 0x00 or 0x80
  adc hl,hl
  rla
  adc hl,hl
  rla
;AHL - (DE+DE+1)
  sbc hl,de \ sbc a,b
  inc e
  or a
  sbc hl,de \ sbc a,b
  ret p
  add hl,de
  adc a,b
  dec e
  add hl,de
  adc a,b
  ret
Thank you so much for the inspiration to work on this routine! I was able to take the results here and make my float routines a little bit faster and I saved 222 bytes in all!
45
« on: March 23, 2019, 10:18:40 am »
Here is my version. It's about 50 bytes larger, but averages 1889.75cc. It does use stack and shadow registers, but it could be made faster by totally unrolling (which would be about 300 bytes of code).
sqrt32:
;Input: HLDE
;Output: DE is the square root, AHL is the remainder
;Destroys: D'E', H'L'
;Speed: 248+{0,44}+3*sqrt32sub+sqrt32sub_2+sqrt32_iter15
;min: 1697cc
;max: 2086cc
;avg: 1889.75cc
;
;Python implementation:
;  remainder = 0
;  acc = 0
;  for k in range(16):
;    acc<<=1
;    x&=0xFFFFFFFF
;    x<<=2
;    y=x>>32
;    remainder<<=2
;    remainder+=y
;    if remainder>=acc*2+1:
;      remainder-=(acc*2+1)
;      acc+=1
;  return [acc,remainder]
;
  di
  exx
  ld hl,0           ;remainder
  ld d,h \ ld e,h   ;acc
  exx
  ld a,h \ call sqrt32sub \ exx
  ld a,l \ call sqrt32sub \ exx
  ld a,d \ call sqrt32sub \ exx
;Now we have four more iterations
;The first two are no problem
  ld a,e
  exx
  call sqrt32sub_2
;On the next iteration, HL might temporarily overflow by 1 bit
  call sqrt32_iter15
;On the next iteration, HL is allowed to overflow, DE could overflow with our current routine, but it needs to be shifted right at the end, anyways
sqrt32_iter16:
  add a,a
  adc hl,hl
  rla
  adc hl,hl
  rla
;AHL - (DE+DE+1)
  sbc hl,de \ sbc a,0
  inc e
  sbc hl,de \ sbc a,0
  ret p
  add hl,de
  adc a,0
  dec e
  add hl,de
  adc a,0
  ret
sqrt32sub:
;min: 391cc
;max: 483cc
;avg: 437cc
  exx
  call sqrt32sub_2
sqrt32sub_2:
;min: 185cc
;max: 231cc
;avg: 208cc
  call +_
_:
;min: 84cc
;max: 107cc
;avg: 95.5cc
  sll e \ rl d      ;sla e \ rl d \ inc e
  add a,a
  adc hl,hl
  add a,a
  adc hl,hl
  sbc hl,de
  inc e
  ret nc
  dec e
  add hl,de
  dec e
  ret
sqrt32_iter15:
;91+{8,0+{0,23}}
;min: 91cc
;max: 114cc
;avg: 100.75cc
  sll e \ rl d      ;sla e \ rl d \ inc e
  add a,a
  adc hl,hl
  add a,a
  adc hl,hl         ;This might overflow!
  jr c,sqrt32_iter15_br0
;
  sbc hl,de
  inc e
  ret nc
  dec e
  add hl,de
  dec e
  ret
sqrt32_iter15_br0:
  or a
  sbc hl,de
  inc e
  ret
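The Python model in the comments of sqrt32 can be wrapped into a runnable function for checking the assembly against known values. The wrapper name `sqrt32_model` is invented here; the loop body follows the commented pseudocode, reading the garbled `remainder=(acc*2+1)` line as a subtraction.

```python
import math

def sqrt32_model(x: int):
    """Bit-pair square root of a 32-bit x; returns [root, remainder]."""
    remainder = 0
    acc = 0
    for k in range(16):
        acc <<= 1
        x &= 0xFFFFFFFF
        x <<= 2
        y = x >> 32                # the two bits shifted out the top
        remainder <<= 2
        remainder += y
        if remainder >= acc * 2 + 1:
            remainder -= acc * 2 + 1
            acc += 1
    return [acc, remainder]

# Spot-check against Python's exact integer square root.
for value in (0, 1, 99, 0x12345678, 0xFFFFFFFF):
    root, rem = sqrt32_model(value)
    assert root == math.isqrt(value) and rem == value - root * root
```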
EDIT: Oh jeez, here is an even bigger version that uses less stack space and doesn't use shadow registers or index registers:
sqrt32:
;Input: HLDE
;speed: 238+{0,1}+{0,44}+sqrtHL+3*sqrt32sub_2+sqrt32_iter15
;min: 1260
;max: 1506
;avg: 1377.75
  push de
  call sqrtHL
  pop bc
  add a,a
  ld e,a
  jr nc,+_
  inc d
_:
  ld a,b
  call sqrt32sub_2
  call sqrt32sub_2
;Now we have four more iterations
;The first two are no problem
  ld a,c
  call sqrt32sub_2
;On the next iteration, HL might temporarily overflow by 1 bit
  call sqrt32_iter15
;On the next iteration, HL is allowed to overflow, DE could overflow with our current routine, but it needs to be shifted right at the end, anyways
sqrt32_iter16:
  add a,a
  adc hl,hl
  rla
  adc hl,hl
  rla
;AHL - (DE+DE+1)
  sbc hl,de \ sbc a,0
  inc e
  or a
  sbc hl,de \ sbc a,0
  ret p
  add hl,de
  adc a,0
  dec e
  add hl,de
  adc a,0
  ret
sqrt32sub_2:
;min: 185cc
;max: 231cc
;avg: 208cc
  call +_
_:
;min: 84cc
;max: 107cc
;avg: 95.5cc
  sll e \ rl d
  add a,a \ adc hl,hl
  add a,a \ adc hl,hl
  sbc hl,de
  inc e
  ret nc
  dec e
  add hl,de
  dec e
  ret
sqrt32_iter15:
;91+{8,0+{0,23}}
;min: 91cc
;max: 114cc
;avg: 100.75cc
  sll e \ rl d      ;sla e \ rl d \ inc e
  add a,a
  adc hl,hl
  add a,a
  adc hl,hl         ;This might overflow!
  jr c,sqrt32_iter15_br0
;
  sbc hl,de
  inc e
  ret nc
  dec e
  add hl,de
  dec e
  ret
sqrt32_iter15_br0:
  or a
  sbc hl,de
  inc e
  ret
.echo $-sqrt32
sqrtHL:
;returns A as the sqrt, HL as the remainder, D = 0
;min: 376cc
;max: 416cc
;avg: 393cc
  ld de,$5040
  ld a,h
  sub e
  jr nc,+_
  add a,e
  ld d,$10
_:
  sub d
  jr nc,+_
  add a,d
  .db $01     ;start of ld bc,** which is 10cc to skip the next two bytes.
_:
  set 5,d
  res 4,d
  srl d
  set 2,d
  sub d
  jr nc,+_
  add a,d
  .db $01     ;start of ld bc,** which is 10cc to skip the next two bytes.
_:
  set 3,d
  res 2,d
  srl d
  inc d
  sub d
  jr nc,+_
  add a,d
  dec d       ;this resets the low bit of D, so `srl d` resets carry.
  .db $06     ;start of ld b,* which is 7cc to skip the next byte.
_:
  inc d
  srl d
  ld h,a
  sbc hl,de
  ld a,e
  jr nc,+_
  add hl,de
_:
  ccf
  rra
  srl d
  rra
  ld e,a
  sbc hl,de
  jr nc,+_
  add hl,de
  .db $01     ;start of ld bc,** which is 10cc to skip the next two bytes.
_:
  or %00100000
  xor %00011000
  srl d
  rra
  ld e,a
  sbc hl,de
  jr nc,+_
  add hl,de
  .db $01     ;start of ld bc,** which is 10cc to skip the next two bytes.
_:
  or %00001000
  xor %00000110
  srl d
  rra
  ld e,a
  sbc hl,de
  jr nc,+_
  add hl,de
  srl d
  rra
  ret
_:
  inc a
  srl d
  rra
  ret
.echo $-sqrtHL
It does use the 16-bit square root routine here to take care of the first 16 bits. Combined, it is 194 bytes.
EDIT2: Forgot that sqrtHL didn't preserve BC, fixed that. Now it seems the last bit of the remainder might be broken, so I have to fix that.
EDIT3: Fixed the bug in the bottom bit. I just needed to reset the carry flag before the second subtraction in the final iteration.
EDIT4: In a scenario where you don't have RAM for a stack, we can hardcode it! It even saves 54cc (but adds 20 bytes). 10cc of that 54cc is just due to not having an ending RET. I also switched input to HLIX instead of HLDE. I reorganized the code so that it would be "obvious" that sqrt32 is an inline routine, so I put it at the end (in practice, the subroutines would probably be toward the end of mem). The precomputed stack is inserted just before sqrt32.
sqrt32sub_2:
;min: 178cc
;max: 224cc
;avg: 201cc
  jp return4
return4:
;min: 84cc
;max: 107cc
;avg: 95.5cc
  sll e \ rl d
  add a,a \ adc hl,hl
  add a,a \ adc hl,hl
  sbc hl,de
  inc e
  ret nc
  dec e
  add hl,de
  dec e
  ret
sqrt32_iter15:
;91+{8,0+{0,23}}
;min: 91cc
;max: 114cc
;avg: 100.75cc
  sll e \ rl d      ;sla e \ rl d \ inc e
  add a,a
  adc hl,hl
  add a,a
  adc hl,hl         ;This might overflow!
  jr c,sqrt32_iter15_br0
;
  sbc hl,de
  inc e
  ret nc
  dec e
  add hl,de
  dec e
  ret
sqrt32_iter15_br0:
  or a
  sbc hl,de
  inc e
  ret
sqrtHL:
;returns A as the sqrt, HL as the remainder, D = 0
;min: 376cc
;max: 416cc
;avg: 393cc
  ld de,$5040
  ld a,h
  sub e
  jr nc,+_
  add a,e
  ld d,$10
_:
  sub d
  jr nc,+_
  add a,d
  .db $01     ;start of ld bc,** which is 10cc to skip the next two bytes.
_:
  set 5,d
  res 4,d
  srl d
  set 2,d
  sub d
  jr nc,+_
  add a,d
  .db $01     ;start of ld bc,** which is 10cc to skip the next two bytes.
_:
  set 3,d
  res 2,d
  srl d
  inc d
  sub d
  jr nc,+_
  add a,d
  dec d       ;this resets the low bit of D, so `srl d` resets carry.
  .db $06     ;start of ld b,* which is 7cc to skip the next byte.
_:
  inc d
  srl d
  ld h,a
  sbc hl,de
  ld a,e
  jr nc,+_
  add hl,de
_:
  ccf
  rra
  srl d
  rra
  ld e,a
  sbc hl,de
  jr nc,+_
  add hl,de
  .db $01     ;start of ld bc,** which is 10cc to skip the next two bytes.
_:
  or %00100000
  xor %00011000
  srl d
  rra
  ld e,a
  sbc hl,de
  jr nc,+_
  add hl,de
  .db $01     ;start of ld bc,** which is 10cc to skip the next two bytes.
_:
  or %00001000
  xor %00000110
  srl d
  rra
  ld e,a
  sbc hl,de
  jr nc,+_
  add hl,de
  srl d
  rra
  ret
_:
  inc a
  srl d
  rra
  ret
sqrt32_stack:
  .dw return0
  .dw return4       ;subroutine
  .dw return1
  .dw return4       ;subroutine
  .dw return2
  .dw return4       ;subroutine
  .dw return3
  .dw return5
sqrt32_stack_end:
sqrt32:
;Input: HLIX
;Output: DE is the sqrt, AHL is the remainder
;min: 1203
;max: 1455
;avg: 1323.75
  ld sp,sqrt32_stack
  jp sqrtHL
return0:
  add a,a
  ld e,a
  jr nc,+_
  inc d
_:
  ld a,ixh
  jp sqrt32sub_2
return1:
  jp sqrt32sub_2
return2:
;Now we have four more iterations
;The first two are no problem
  ld a,ixl
  jp sqrt32sub_2
return3:
;On the next iteration, HL might temporarily overflow by 1 bit
  jp sqrt32_iter15
return5:
;On the next iteration, HL is allowed to overflow, DE could overflow with our current routine, but it needs to be shifted right at the end, anyways
sqrt32_iter16:
  add a,a
  adc hl,hl
  rla
  adc hl,hl
  rla
;AHL - (DE+DE+1)
  sbc hl,de \ sbc a,0
  inc e
  or a
  sbc hl,de \ sbc a,0
  jp p,+_
  add hl,de
  adc a,0
  dec e
  add hl,de
  adc a,0
_:
;...
