Omnimaga
Calculator Community => Other Calc-Related Projects and Ideas => TI Z80 => Topic started by: Xeda112358 on July 28, 2019, 11:49:54 pm
-
Hey there, it's ya gender non-specific diminutive Zeda, here, and today we'll be looking at the Fisher-Yates algorithm and just how freaking efficient it can be for shuffling a list. For reference, it takes one second to shuffle a 999-element list at 6MHz, and if that ain't the way your deity intended it, I don't know what is.
First, how do we shuffle L1 in BASIC?
rand(dim(L1->L2
SortA(L2,L1
This is a super clever algorithm, but slow as heck as the lists get bigger. Plus, it uses an extra list of the same size, wasting precious RAM. So how does the Fisher-Yates algorithm work? You start at the last element. Randomly choose an element up to and including the current element and swap them. Now move down one element and repeat (so now the last element is off limits, then the last two, et cetera). Repeat this until there is one element left.
This is easy to perform in-place, and it performs n-1 swaps, making it significantly faster than the BASIC algorithm above. In fact, let's implement it in BASIC:
dim(L1->N
For(K,N,2,-1
randInt(1,K->A
L1(K->B
L1(A->L1(K
B->L1(A
End
This takes approximately 37.5 seconds to sort a 999 element list. I don't even have the RAM needed to test the regular method, but extrapolating, it would take the "normal" method approximately 73 seconds for 999 elements. So basically, the Fisher-Yates algorithm is actually faster even in TI-BASIC (after about 400 elements, though).
So without further ado, the assembly code!
;Randomizes a TI-list in Ans
_RclAns= 4AD7h
seed1 = $80F8
seed2 = $80FC
seed1_0=seed1
seed1_1=seed1+2
seed2_0=seed2
seed2_1=seed2+2
#define bcall(x) rst 28h \ .dw x
.db $BB,$6D
.org $9D95
; Put it into 15MHz mode if possible!
in a,(2)
add a,a
sbc a,a
out (20h),a
; Initialize the random seed
ld hl,seed1
ld b,7
ld a,r
_:
xor (hl)
ld (hl),a
inc hl
djnz -_
or 99
or (hl)
ld (hl),a
; Locate Ans, verify that it is a list or complex list
bcall(_RclAns)
ex de,hl
ld c,(hl)
inc hl
ld b,(hl)
inc hl
ld (list_base),hl
dec a
jr z,+_
sub 12
ret nz
dec a
_:
;A is 0 if a real list, -1 if complex
;HL points to the first element
;BC is the number of elements
and $29 ;make it either NOP or ADD HL,HL
ld (get_complex_element),a
sub 29h
sbc a,a
;FF if real, 00 if complex
cpl
and 9
add a,9
ld (element_size),a
shuffle_loop:
push bc
push bc
call rand
pop bc
ex de,hl
call mul16
dec bc
;swap elements DE and BC
call get_element
push hl
ld d,b
ld e,c
call get_element
pop de
call swap_elements
pop bc
dec bc
ld a,c
dec a
jr nz,shuffle_loop
inc b
dec b
jr nz,shuffle_loop
ret
swap_elements:
;HL and DE point to the elements
element_size = $+2
ld bc,255
_:
ld a,(de)
ldi
dec hl
ld (hl),a
inc hl
djnz -_
ret
get_element:
;Input:
; DE is the element to locate
;Output:
; HL points to the element
ld l,e
ld h,d
add hl,hl
add hl,hl
add hl,hl
add hl,de
get_complex_element:
nop
list_base = $+1
ld de,0
add hl,de
ret
rand:
;Tested and passes all CAcert tests
;Uses a very simple 32-bit LCG and 32-bit LFSR
;it has a period of 18,446,744,069,414,584,320
;roughly 18.4 quintillion.
;LFSR taps: 0,2,6,7 = 11000101
;291cc
;Thanks to Runer112 for his help on optimizing the LCG and suggesting to try the much simpler LCG. On their own, the two are terrible, but together they are great.
ld hl,(seed1)
ld de,(seed1+2)
ld b,h
ld c,l
add hl,hl \ rl e \ rl d
add hl,hl \ rl e \ rl d
inc l
add hl,bc
ld (seed1_0),hl
ld hl,(seed1_1)
adc hl,de
ld (seed1_1),hl
ex de,hl
;;lfsr
ld hl,(seed2)
ld bc,(seed2+2)
add hl,hl \ rl c \ rl b
ld (seed2_1),bc
sbc a,a
and %11000101
xor l
ld l,a
ld (seed2_0),hl
ex de,hl
add hl,bc
ret
mul16:
;BC*DE
ld hl,0
ld a,16
mul16_loop:
add hl,hl
rl e
rl d
jr nc,+_
add hl,bc
jr nc,+_
inc de
_:
dec a
jr nz,mul16_loop
ret
It isn't perfect, but it is pretty good and importantly, it is fast! The biggest problem is in the random number generator, but even that is still pretty good for this application.
-
Neat! It's funny that implementing it in BASIC is Faster than using TI's built in tools. :)
-
How could I implement this in Grammer. I'm making solitaire and it's pretty tedious. Since you can't do a decremented loop I was thinking about using Repeat although it would mean more bytes used. I could also use a z80 routine that works. I'm building with Mimas 0.4 so I don't know how to implement the labels. It gives and error when I try to set seed1 = to what ever the address was.
-
For the assembly version, you can replace the seed= with just a label, "seed:". Then the (seed) instances need (seed+1).
As for a Grammer version, assuming a 52-card deck, you can initialize once with:
π9872→Z
For K,0,51
WriteB(Z+K,K
End
Then every time you want to shuffle, assuming Z still points to the deck:
.SHUFFLE
51→K
While K>0
randInt(0,K→A
(Z+K→B
WriteB(Z+K,(Z+A
WriteB(Z+A,B
K-1→K
End
End
It's honestly a beautiful algorithm and works well. It helps that Grammer has a pretty good pseudo-random number generator (statistically fantastic, especially for only providing 16 bits at a time :P )
-
For the assembly version, you can replace the seed= with just a label, "seed:". Then the (seed) instances need (seed+1).
As for a Grammer version, assuming a 52-card deck, you can initialize once with:
π9872→Z
For K,0,51
WriteB(Z+K,K
End
Then every time you want to shuffle, assuming Z still points to the deck:
.SHUFFLE
51→K
While K>0
randInt(0,K→A
(Z+K→B
WriteB(Z+K,(Z+A
WriteB(Z+A,B
K-1→K
End
End
It's honestly a beautiful algorithm and works well. It helps that Grammer has a pretty good pseudo-random number generator (statistically fantastic, especially for only providing 16 bits at a time :P )
Yah it helps a lot because I knew you could implement it, I just didn't know how. :>. Now I Do! Thanks! I'm gonna continue making solitaire now. Did you get the number displaying bug fixed yet?
EDIT: Never mind on the bug. I was on an outdated version... Silly me...