
Messages - FloppusMaximus


ASM / Re: Streamlined Asm routines

« on: January 07, 2012, 03:53:28 pm »

Quote from: Xeda112358 on January 04, 2012, 02:21:31 pm

Okay, so Runer was speculating about how to get the best speed out of a math routine, so the first challenge he gave was for 8x8 multiplication with a 16-bit output. I am not doing all that well with the challenge, but here is a variation of what I came up with that actually is an 8x16 multiplication (it requires 4 more cycles to make it 8x8).

Unless I've made a mistake somewhere, this routine doesn't work as written, because at the time you're testing the sign flag, the bit you're interested in has already been shifted out. But it's an interesting idea, so here's a version that works (but could probably be optimized more):

Code:

 ld hl,0   ; 10
 or a   ; 4
 ret z   ; 5

 scf   ; 4
skip_zeroes:
 adc a,a   ; 4(9-n)
 jr nc,skip_zeroes ; 12(9-n) - 5

 jp loop_add0  ; 10

loop_add:
 ret z   ; 5k + 6
 add hl,hl  ; 11k
loop_add0:
 add hl,de  ; 11(k+1)
loop_noadd:
 add a,a   ; 4n
 jr c, loop_add  ; 7n + 5k
 add hl,hl  ; 11(n-k-1)
 jp loop_noadd  ; 10(n-k-1)

If I've worked it out correctly, this has a minimum (non-trivial) running time of 192 cycles, a maximum of 437, and an average of ~368.57.
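For anyone who wants to check the logic of the routine above rather than its cycle counts, here is a rough Python model. It is only a sketch: the name `mul8x16` is made up for illustration, the registers are modelled as plain integers, and timing is not modelled at all.

```python
def mul8x16(a, de):
    """Register-level model of the routine: returns HL = (A * DE) mod 65536."""
    hl = 0                       # ld hl,0
    if a == 0:                   # or a / ret z
        return hl
    carry = 1                    # scf: a sentinel bit enters below the multiplier
    while True:                  # skip_zeroes: shift until the top 1 bit falls out
        a = (a << 1) | carry     # adc a,a
        carry = a >> 8
        a &= 0xFF
        if carry:                # jr nc,skip_zeroes falls through on carry
            break
    hl = (hl + de) & 0xFFFF      # jp loop_add0 / add hl,de
    while True:                  # loop_noadd
        a <<= 1                  # add a,a
        carry = a >> 8
        a &= 0xFF
        if carry:                # jr c,loop_add
            if a == 0:           # ret z: the bit just shifted out was the sentinel
                return hl
            hl = (hl * 2) & 0xFFFF   # add hl,hl
            hl = (hl + de) & 0xFFFF  # add hl,de
        else:
            hl = (hl * 2) & 0xFFFF   # add hl,hl / jp loop_noadd
```

The sentinel bit is what lets a single `ret z` detect the end of the multiplier without a separate loop counter: A only becomes zero once the sentinel itself has been shifted out.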

The Axe Parser Project / Re: Bug Reports

« on: December 14, 2011, 10:08:23 pm »

Quote from: jacobly on December 11, 2011, 03:11:59 am

Some fixes for this are to only read $8446 if a >= $fc, reset $8446 after it is read, or to change checks for any key that is not a lowercase letter to getKeyʳ^256=key code.
Edit: And in the last case, it should probably be documented somewhere, since it is not obvious just from playing around with getKeyʳ.

I haven't been following this discussion, but I think you mean A ≥ $FB (kExtendEcho3), not $FC. kExtendEcho3 was added in OS 1.15, and is used for various new tokens as well as the TI-Keyboard keycodes. Confusingly, $FB was used as a keycode in older OSes (kwnA) but it wasn't a prefix. For compatibility, I'd recommend that you zero keyExtend before calling GetKey:

Code:

xor a
ld (keyExtend), a
bcall GetKey ; or GetKeyRetOff, whatever
ld hl, (keyExtend-1)
ld l, a

That way, every keycode is unique and consistent across all OSes.
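To make the effect of zeroing keyExtend concrete, here is a small Python model of the snippet above. It is a sketch under one stated assumption: that GetKey only writes keyExtend for two-byte (prefixed) keys and otherwise leaves whatever was there. The function `read_key` and its parameters are hypothetical names, not OS routines.

```python
def read_key(stale_extend, a, new_extend=None, clear_first=True):
    """Return the 16-bit code that ends up in HL.

    stale_extend: byte left in keyExtend by some earlier keypress
    a:            value GetKey returns in A
    new_extend:   byte GetKey stores in keyExtend, or None if it leaves it alone
                  (assumed behaviour for keys without a prefix)
    clear_first:  True models `xor a` / `ld (keyExtend),a` before the call
    """
    key_extend = 0 if clear_first else stale_extend
    if new_extend is not None:
        key_extend = new_extend
    # ld hl,(keyExtend-1) puts keyExtend in H; ld l,a puts the keycode in L
    return (key_extend << 8) | a

# An ordinary key after an earlier extended key left 0x12 behind:
print(hex(read_key(0x12, 0x05)))                     # cleared first
print(hex(read_key(0x12, 0x05, clear_first=False)))  # stale high byte leaks in
```

With the clear in place, the high byte is nonzero only when GetKey has just written it, so the 16-bit value depends only on the key that was pressed.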

ASM / Re: 24 bit multiplication

« on: December 11, 2011, 04:41:31 pm »

Quote from: jacobly on December 08, 2011, 08:35:58 pm

My first multiplication routine takes 2746 - 4570 cycles, the second takes 1680 - 2880 cycles.

Oh boy, optimization time

The best I have so far is somewhere around 1800 cycles average (I'm too lazy to work out the exact probabilities at the moment, and not counting memory delays) using a squaring table and undocumented IX instructions. Input is BDE and CHL, output is BCDEAL. This routine works by expanding the formula 2xy = x²+y²-|x-y|², summed over each of the 9 pairs of bytes in the input.

(I'm not saying this is practical - unless you really have thousands of 24-bit multiplications to perform, you don't need this kind of speed. This is just for fun.)
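The identity itself is easy to sanity-check outside the calculator. The following Python sketch applies 2xy = x²+y²-|x-y|² to each of the 9 byte pairs using a 256-entry table of squares. It models only the arithmetic, not the register allocation or carry handling of the assembly, and the names `SQR` and `mul24` are invented for this example.

```python
SQR = [i * i for i in range(256)]   # square of every possible byte value

def mul24(x, y):
    """Multiply two 24-bit values using only table lookups, adds and shifts."""
    xs = [(x >> (8 * i)) & 0xFF for i in range(3)]   # bytes of x, low first
    ys = [(y >> (8 * j)) & 0xFF for j in range(3)]   # bytes of y, low first
    twice = 0                                        # accumulates 2*x*y
    for i, xb in enumerate(xs):
        for j, yb in enumerate(ys):
            two_xy = SQR[xb] + SQR[yb] - SQR[abs(xb - yb)]   # equals 2*xb*yb
            twice += two_xy << (8 * (i + j))
    return twice >> 1                                # halve once at the end

print(hex(mul24(0x123456, 0xABCDEF)))
```

Note that the x² terms summed over all pairs factor as (x0² + x1²·2^8 + x2²·2^16) × 0x10101, and likewise for y, which is presumably what the helper `BC_DE_HL_times_10101` computes.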

Code:

SUBFIRST .macro src1, src2, hdest, ldest
 exx
 ld a, src1
 sub src2
 jr nc, $ + 4
 neg
 exx
 ld l, a
 ld a, ldest
 sub (hl)
 ld ldest, a
 inc h
 ld a, hdest
 sbc a, (hl)
 ld hdest, a
  .endm

SUBNEXT .macro src1, src2, hdest, ldest
 dec h
 ex af, af'
 exx
 ld a, src1
 sub src2
 jr nc, $ + 4
 neg
 exx
 ld l, a
 ex af, af'
 ld a, ldest
 sbc a, (hl)
 ld ldest, a
 inc h
 ld a, hdest
 sbc a, (hl)
 ld hdest, a
  .endm

BDE_times_CHL_sqrdiff_v3:
 ld a, d
 exx
 ld h, high(sqrtab)
 ld l, a
 ld e, (hl)
 inc h
 ld d, (hl)  ; DE = d²
 exx
 ld a, b
 exx
 ld l, a
 ld b, (hl)
 dec h
 ld c, (hl)  ; BC = b²
 exx
 ld a, e
 exx
 ld l, a
 ld a, (hl)
 inc h
 ld h, (hl)
 ld l, a   ; HL = e²
 call BC_DE_HL_times_10101
 push bc
  push hl
   push de
    exx
    ld a, h
    exx
    ld h, high(sqrtab)
    ld l, a
    ld e, (hl)
    inc h
    ld d, (hl)  ; DE = h²
    exx
    ld a, c
    exx
    ld l, a
    ld b, (hl)
    dec h
    ld c, (hl)  ; BC = c²
    exx
    ld a, l
    exx
    ld l, a
    ld a, (hl)
    inc h
    ld h, (hl)
    ld l, a  ; HL = l²
    call BC_DE_HL_times_10101
    pop ix
   add ix, de
   pop de
  adc hl, de
  ex de, hl
  pop hl
 adc hl, bc
 ld b, h
 ld c, l   ; BCDEIX = total
 push af

  ld h, high(sqrtab)
  SUBFIRST e, l, ixh, ixl
  SUBNEXT  d, h, d, e
  SUBNEXT  b, c, b, c
  jp nc, BDE_times_CHL_sqrdiff_v3_nc1
  pop af
 ccf
 push af
BDE_times_CHL_sqrdiff_v3_nc1:

  inc b

  dec h
  SUBFIRST e, h, e, ixh
  SUBNEXT  d, c, c, d
  jr nc, BDE_times_CHL_sqrdiff_v3_nc2
  dec b
  jp nz, BDE_times_CHL_sqrdiff_v3_nc2
  pop af
 ccf
 push af
BDE_times_CHL_sqrdiff_v3_nc2:

  dec h
  SUBFIRST d, l, e, ixh
  SUBNEXT  b, h, c, d
  jr nc, BDE_times_CHL_sqrdiff_v3_nc3
  dec b
  jp nz, BDE_times_CHL_sqrdiff_v3_nc3
  pop af
 ccf
 push af
BDE_times_CHL_sqrdiff_v3_nc3:

  inc c

  dec h
  SUBFIRST b, l, d, e
  jr nc, BDE_times_CHL_sqrdiff_v3_nc4
  dec c
  jp nz, BDE_times_CHL_sqrdiff_v3_nc4
  dec b
  jp nz, BDE_times_CHL_sqrdiff_v3_nc4
  pop af
 ccf
 push af
BDE_times_CHL_sqrdiff_v3_nc4:

  dec h
  SUBFIRST e, c, d, e
  pop hl
 jr nc, BDE_times_CHL_sqrdiff_v3_nc5
 dec c
 jp nz, BDE_times_CHL_sqrdiff_v3_nc5
 dec b
 jp nz, BDE_times_CHL_sqrdiff_v3_nc5
 inc l
BDE_times_CHL_sqrdiff_v3_nc5:

 dec b
 dec c

 rr l
 rr b
 rr c
 rr d
 rr e
 ld a, ixl
 ld l, a
 ld a, ixh
 rra
 rr l
 ret


BC_DE_HL_times_10101:
 push bc
  ld a, h
  ex af, af'
  sub a
  ld c, a
  ld b, l
  add hl, bc
  adc a, a
  ld b, e
  add hl, bc
  adc a, c  ; AHL = [ L+H+E L ]
  pop bc
 push hl
  push bc
   ld c, a
   ld b, 0
   ex af, af'
   ld h, a
   add hl, bc  ; no way this can carry (initial HL is a square)
   ld c, a
   ld b, e
   sub a
   add hl, bc
   adc a, a  ; AHL(SP+2) = [ H+E L+H L+H+E L ]
   add hl, de
   adc a, 0  ; AHL(SP+2) = [ H+E+D L+H+E L+H+E L ]
   pop bc
  add hl, bc
  adc a, 0  ; AHL(SP) = [ H+E+D+B L+H+E+C L+H+E L ]
  ld e, d
  ld d, c
  add hl, de
  adc a, b
  jr nc, BC_DE_HL_times_10101_nc1
  inc b   ; BAHL(SP) = [ B B H+E+D+C+B L+H+E+D+C L+H+E L ]
BC_DE_HL_times_10101_nc1:
  add a, e
  jr nc, BC_DE_HL_times_10101_nc2
  inc b   ; BAHL(SP) = [ B D+B H+E+D+C+B L+H+E+D+C L+H+E L ]
BC_DE_HL_times_10101_nc2:
  pop de
 add a, c
 ld c, a
 ret nc
 inc b   ; BCHLDE = [ B D+C+B H+E+D+C+B L+H+E+D+C L+H+E L ]
 ret

To get back to the topic somewhat, ACagliano, it sounds like you're more interested in squaring than in general multiplication. Squaring can be considerably faster, especially if you use a lookup table (e.g., my best 16-bit squaring routine is around 170 cycles, versus around 800 for general multiplication.)
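As an illustration of why a lookup table helps so much for squaring, here is one possible decomposition in Python. This is not necessarily how the 170-cycle routine mentioned above works; it is just a sketch, with `SQR` and `square16` as made-up names, showing that a 16-bit square needs only three table lookups.

```python
SQR = [i * i for i in range(256)]   # square of every possible byte value

def square16(x):
    """Square a 16-bit value using three lookups in a byte-squares table."""
    h, l = x >> 8, x & 0xFF
    # (256h + l)^2 = 65536*h^2 + 256*(2*h*l) + l^2,
    # and 2*h*l = h^2 + l^2 - (h - l)^2 needs no multiplication.
    two_hl = SQR[h] + SQR[l] - SQR[abs(h - l)]
    return (SQR[h] << 16) + (two_hl << 8) + SQR[l]

print(square16(1000))
```

A general 16x16 multiply done the same way needs the cross terms for four byte pairs, while squaring has only one distinct cross term, which is consistent with squaring being much cheaper.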

ASM / Re: BIT n,(HL) flags

« on: December 08, 2011, 03:12:47 am »

Ah, of course. They don't actually use the register, they just increment it or decrement it for the heck of it. (Now I wonder why LDI and LDD don't do the same.)

I was hoping the answer would shed some light on the extremely peculiar behavior of the flags following CPI/CPD. Oh well. Thanks for posting this.

ASM / Re: BIT n,(HL) flags

« on: December 06, 2011, 07:49:28 pm »

The undocumented register you're referring to is called W. It's half of a register pair called WZ; there's no way to access either W or Z directly, so the precise details aren't that important. WZ has a shadow register WZ', which is enabled/disabled by the EXX instruction (I really have no idea why.)

You may have read Sean Young's "The Undocumented Z80 Documented" (if you haven't, you should). As far as I know, nobody has written anything more about it than that, so I've had to reverse-engineer everything else myself.

(And of course, this is totally academic; it only matters for people like us who care about perfectly emulating every single bit.)

WZ is used for:
- addresses that are to be jumped to (to avoid having to read and write PC at the same time)
- temporary 16-bit additions, like the IX/IY instructions you mentioned (because there's nowhere else where the value can be stored)
- all input and output addresses (I don't really know why, but I guess it simplifies the internal logic somewhat)

And a few other things. WZ is often incremented or decremented after it's used, which is the only reason the Z register matters at all.

The only instructions I never worked out are CPI(R) and CPD(R): they definitely use WZ in some way, but I don't know the details.

The only real documentation I have is the TilEm 2 source: z80cmds.h, z80main.h, z80cb.h, z80ed.h, z80ddfd.h.
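For emulator writers, the practical consequence for BIT n,(HL) can be sketched in a few lines of Python. This reflects my understanding of the commonly described behaviour (the two undocumented flag bits are copied from W rather than from the tested byte); the function name and argument layout are invented for this example and are not taken from the TilEm source.

```python
# Bit masks for the Z80 flag register F, from bit 0 to bit 7
FLAG_C, FLAG_N, FLAG_PV, FLAG_X, FLAG_H, FLAG_Y, FLAG_Z, FLAG_S = (
    1 << i for i in range(8)
)

def bit_n_hl_flags(n, value, w, old_f):
    """Flags after BIT n,(HL), where `value` is the byte read from (HL)
    and `w` is the high byte of the internal WZ pair."""
    f = (old_f & FLAG_C) | FLAG_H        # C is preserved, H is set, N is reset
    if not value & (1 << n):
        f |= FLAG_Z | FLAG_PV            # tested bit clear: Z (and P/V) set
    elif n == 7:
        f |= FLAG_S                      # S only reflects a set bit 7
    f |= w & (FLAG_X | FLAG_Y)           # undocumented bits 3 and 5 come from W
    return f

print(hex(bit_n_hl_flags(0, 0x00, 0x28, 0x01)))
```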

General Calculator Help / Re: TI/Casio IO cables--what's the difference?

« on: May 09, 2011, 10:34:00 pm »

Quote from: Darl181 on May 05, 2011, 07:52:25 pm

Tried it again, and the link console just seemed to freeze. A few times, an "FF" appeared on the bottom half inverted, so I'm guessing it was part of the

Quote from: FloppusMaximus on May 03, 2011, 09:58:26 pm

8C 97 FF FF

Can someone say what port I'm supposed to watch, or is it the link console? TIA

Yes - I was referring to the "Link Console", not the "Port Monitor" - sorry I wasn't clear.

If you're only seeing FFs, that would seem to indicate it's not working properly. On the other hand, if critor's right about the plugs being different, that could be causing problems as well, and maybe if the plug were seated differently it might be made to work. It'd be interesting to test it with two TI calculators, and see if the individual lines can be controlled as expected.

ASM / Re: compiletime errors

« on: May 09, 2011, 10:22:59 pm »

Note that LD BC,(abc) is faster and smaller than LD HL,(abc) / LD B,H / LD C,L.

Also note that in some cases you might be able to use EX DE,HL instead of LD. LD HL,(abc) / EX DE,HL is the same size and speed as LD DE,(abc).

Also also note that while there are undocumented instructions to load the two halves of IX to and from other registers, those instructions don't work on buggy emulators like the Nspire. So for compatibility you should probably stick to using, e.g., PUSH DE / POP IX to load DE into IX.
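The arithmetic behind these comparisons can be tallied with a small Python table. The byte counts and T-state figures are the standard documented Z80 values (no memory delays); the `COST` table and `total` helper are just illustrative names.

```python
# (size in bytes, T-states) for each instruction
COST = {
    "ld bc,(nn)": (4, 20),
    "ld de,(nn)": (4, 20),
    "ld hl,(nn)": (3, 16),
    "ld b,h":     (1, 4),
    "ld c,l":     (1, 4),
    "ex de,hl":   (1, 4),
    "push de":    (1, 11),
    "pop ix":     (2, 14),
}

def total(*instrs):
    """Return (bytes, T-states) for a sequence of instructions."""
    return (sum(COST[i][0] for i in instrs), sum(COST[i][1] for i in instrs))

print(total("ld bc,(nn)"), "vs", total("ld hl,(nn)", "ld b,h", "ld c,l"))
print(total("ld de,(nn)"), "vs", total("ld hl,(nn)", "ex de,hl"))
print(total("push de", "pop ix"))
```

So the direct LD BC,(abc) saves one byte and 4 T-states over going through HL, the EX DE,HL variant ties with LD DE,(abc), and the PUSH/POP route into IX costs 3 bytes and 25 T-states.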

Computer Programming / Re: how to set up a windows/linux C++ cross-compiler on linux?

« on: May 09, 2011, 10:13:58 pm »

Debian includes the mingw32 cross compiler; it works fine in my experience. (Actually, it includes both "mingw32" and "gcc-mingw32" - I don't really know what the difference is, apart from politics.) You can also build your own, which is easier than you might think.

General Calculator Help / Re: Can you listen to music on a TI-84+ SE?

« on: May 03, 2011, 11:04:08 pm »

I wonder if you could fit a basic TTS system on a calculator and have it sing to you...

(and now I'm remembering that time at programming camp when some guy, upset that the staff wouldn't let us play M-rated games, got a classroom full of iMacs to sing "we want Quake"... over and over, to the tune of Pomp and Circumstance.)

General Calculator Help / Re: TI/Casio IO cables--what's different?

« on: May 03, 2011, 10:09:48 pm »

Well, all the TI calculators use the same low-level protocol, but it's not an official standard or anything - the only devices that use it are TI calculators, and devices like the CBL and calculator robot that are designed to work with TI calculators.

General Calculator Help / Re: TI/Casio IO cables--what's different?

« on: May 03, 2011, 09:58:26 pm »

Well, I guess if you have Calcsys on your own calc, you could connect it to somebody else's calc and then have them attempt to send a variable. The first thing you see should be (if I remember correctly) 73 68 00 00 - if the data lines are swapped, that would come out as 8C 97 FF FF.
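The expected bytes are simply the bitwise complement of the originals, since swapping the two data lines turns every 0 bit into a 1 and vice versa. A tiny Python check (the helper name `flip_bits` is made up):

```python
def flip_bits(data):
    """Complement every bit, as a link cable with swapped data lines would."""
    return bytes(b ^ 0xFF for b in data)

sent = bytes.fromhex("73680000")
print(flip_bits(sent).hex(" ").upper())
```

The same rule turns a transmitted 55 into a received AA.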

You could, if you wanted to, write assembly programs to send and receive variables over a "twisted" cable (if that is, in fact, what it is.) It wouldn't even be terribly difficult. Getting the system I/O functions to work correctly, though, couldn't be done without modifying the OS.

General Calculator Help / Re: TI/Casio IO cables--what's different?

« on: May 03, 2011, 09:38:17 pm »

That's quite possible. An easy way to test it (if you have two 83+-series calcs) would be to use Calcsys's link console and see if all the bits come out flipped (e.g., if one calculator sends 55, the other receives AA.) If the ground is swapped with one of the data lines, though, you won't be able to send anything at all.

ASM / Re: Mimas file converter HELP!

« on: May 03, 2011, 09:25:26 pm »

Ah, I see, that was a mistake. I intended for 8xvtoasm to write its output by default to an asm file, exactly the opposite of what asmto8xv does, but it doesn't - it writes to standard output by default. Unless anybody objects, I'll change that behavior in the next version.

ASM / Re: How to display pixel in certain location

« on: May 03, 2011, 09:14:02 pm »

For GetPixelLoc, note that shells include the ionGetPixel routine, which does the same thing but uses slightly different inputs (A = column and E = row.) There's also the system routine IOffset (known as FIND_PIXEL on the TI-82), which is similar (though slower, naturally) but doesn't add the address of plotSScreen to HL.
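For reference, the calculation that routines of this kind perform can be sketched in Python as follows. It assumes the usual 96x64 monochrome buffer layout (12 bytes per row, leftmost pixel in the most significant bit); `pixel_loc` is a hypothetical name, and the returned offset is relative to the start of the buffer, i.e. before the address of plotSScreen is added.

```python
ROW_BYTES = 12            # 96 pixels per row / 8 pixels per byte

def pixel_loc(col, row):
    """Return (byte offset into the buffer, bit mask) for pixel (col, row)."""
    if not (0 <= col < 96 and 0 <= row < 64):
        raise ValueError("pixel outside the 96x64 screen")
    offset = row * ROW_BYTES + col // 8   # which byte holds the pixel
    mask = 0x80 >> (col % 8)              # bit 7 is the leftmost pixel of a byte
    return offset, mask

print(pixel_loc(9, 1))
```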

News / Re: Warning about OS 3.0.1 destroying calcs

« on: April 21, 2011, 12:41:20 am »

I'd rather run Windows 3.1 than TI-Nspire 3.0
