Messages  Xeda112358
1
« on: Yesterday at 08:37:59 pm »
I rewrote the Input routine and ran into some issues that I finally managed to fix. Now, the cursor blinks, and you can change the location and size of the input buffer! Here is a screenshot where I relocate the input buffer to a spot within the source code (!), and limit it to 9 bytes (8 bytes plus a null byte): The two new "commands" are →Input (Sets the location of the input buffer) and →Input' (Sets the size of the input buffer).
2
« on: August 19, 2019, 03:41:23 pm »
Okay, thanks! There are 26 routines that I'll need to investigate later when I get out of work. For nine of them, I don't know if I'll be able to contact the author, but one of those I plan to make a better implementation of anyways.
EDIT: How does this sound?
1. This License does not apply to any file with a separate License header.
2. Permission is granted, free of charge, to use, modify, and/or distribute any part of this software for any purpose.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Written by Zeda Thomas <[email protected]>, Aug 2019
3
« on: August 19, 2019, 02:36:11 pm »
That's a good point. At the moment, all but three of the routines are from myself or the calculator forums in their useful routines threads. The ones from UTI are explicitly free to use.
4
« on: August 16, 2019, 11:41:13 pm »
Here are some routines that I've added to the repository:

itoa_8
Converts an 8-bit signed integer to an ASCII string.

;Converts an 8-bit signed integer to a string
itoa_8:
;Input:
;   A is a signed integer
;   HL points to where the null-terminated ASCII string is stored (needs at most 5 bytes)
;Output:
;   The number is converted to a null-terminated string at HL
;Destroys:
;   Up to five bytes at HL
;   All registers preserved.
;on 0 to 9:       252       D=0
;on 10 to 99:     258+20D   D=0 to 9
;on 100 to 127:   277+20D   D=0 to 2
;on -1 to -9:     276       D=0
;on -10 to -99:   282+20D   D=0 to 9
;on -100 to -128: 301+20D   D=0 to 2

;min: 252cc (+23cc over original)
;max: 462cc (49cc over original)
;avg: 343.74609375cc = 87999/256
;54 bytes
    push hl
    push de
    push bc
    push af
    or a
    jp p,itoa_pos
    neg
    ld (hl),$1A    ;start if neg char on TI-OS
    inc hl
itoa_pos:
;A is on [0,128]
;calculate 100s place, plus 1 for a future calculation
    ld b,'0'
    cp 100 \ jr c,$+5 \ sub 100 \ inc b

;calculate 10s place digit, +1 for future calculation
    ld de,$0A2F
    inc e \ sub d \ jr nc,$-2
    ld c,a

;Digits are now in D, C, A
; strip leading zeros!
    ld a,'0'
    cp b \ jr z,$+5 \ ld (hl),b \ inc hl \ .db $FE    ; start of `cp *` to skip the next byte, turns into `cp $BB` which will always return nz and nc
    cp e \ jr z,$+4 \ ld (hl),e \ inc hl
    add a,c
    add a,d
    ld (hl),a
    inc hl
    ld (hl),0

    pop af
    pop bc
    pop de
    pop hl
    ret
fixed88_to_string
Uses the itoa_8 routine to convert an 8.8 fixed-point number to a string.

;This converts a fixed-point number to a string.
;It displays up to 3 digits after the decimal.

fixed88_to_str:
;Inputs:
;   D.E is the fixed-point number
;   HL points to where the string gets output.
;      Needs at most 9 bytes.
;Outputs:
;   HL is preserved
;Destroys:
;   AF,DE,BC

;First check if the input is negative.
;If so, write a negative sign and negate
    push hl
    ld a,d
    or a
    jp p,+_
    ld (hl),$1A    ;negative sign on TI-OS
    inc hl
    xor a
    sub e
    ld e,a
    sbc a,a
    sub d
_:

;Our adjusted number is in A.E
;Now we can print the integer part
    call itoa_8

;Check if we need to print the fractional part
    xor a
    cp e
    jr z,fixed88_to_str_end

;We need to write the fractional part, so seek the end of the string
;Search for the null byte. A is already 0
    cpir

;Write a decimal
    dec hl
    ld (hl),'.'

    ld b,3
_:
;Multiply E by 10, converting overflow to an ASCII digit
    call fixed88_to_str_e_times_10
    inc hl
    ld (hl),a
    djnz -_

;Strip the ending zeros
    ld a,'0'
_:
    cp (hl)
    dec hl
    jr z,-_

;write a null byte
    inc hl
    inc hl
    ld (hl),0

fixed88_to_str_end:
;restore HL
    pop hl
    ret

fixed88_to_str_e_times_10:
    ld a,e
    ld d,0
    add a,a \ rl d
    add a,a \ rl d
    add a,e \ jr nc,$+3 \ inc d
    add a,a
    ld e,a
    ld a,d
    rla
    add a,'0'
    ret
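For readers who want to check the expected output on a PC, here is a rough Python model of the same idea: print the integer part, then repeatedly multiply the fractional byte by 10 and peel off the overflow as a digit, up to three times. This is only a sketch. The function name is made up for illustration, and it writes an ASCII "-" where the assembly writes the TI-OS negative character $1A.

```python
def fixed88_to_text(value):
    """Model of an 8.8 fixed-point to string conversion (illustrative only).

    value is a 16-bit pattern: high byte = integer part, low byte = fraction.
    """
    value &= 0xFFFF
    sign = ""
    if value & 0x8000:                 # negative: write a sign and negate
        sign = "-"                     # assumption: ASCII '-' instead of $1A
        value = (-value) & 0xFFFF
    integer, frac = value >> 8, value & 0xFF
    text = sign + str(integer)
    if frac:
        digits = ""
        for _ in range(3):             # up to 3 digits after the decimal
            frac *= 10                 # the overflow past 8 bits is the next digit
            digits += chr(ord("0") + (frac >> 8))
            frac &= 0xFF
        text += "." + digits.rstrip("0")   # strip the ending zeros
    return text
```

For example, the bit pattern $0180 represents 1.5 and $FF80 represents -0.5.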
sqrtA
This is a very fast, unrolled routine to compute the square root of A.

sqrtA:
;Input: A
;Output: D is the square root, A is the remainder (input-D^2)
;Destroys: BC
;speed: 161+{0,6}+{0,1}+{0,1}+{0,3}
;min: 161cc
;max: 172cc
;avg: 166.5cc
;45 bytes
    ld d,$40

    sub d
    jr nc,+_
    add a,d
    ld d,0
_:

    set 4,d
    sub d
    jr nc,+_
    add a,d
    .db $01    ;start of ld bc,** which is 10cc to skip the next two bytes.
_:
    set 5,d
    res 4,d
    srl d

    set 2,d
    sub d
    jr nc,+_
    add a,d
    .db $01    ;start of ld bc,** which is 10cc to skip the next two bytes.
_:
    set 3,d
    res 2,d
    srl d

    inc d
    sub d
    jr nc,+_
    add a,d
    dec d
_:
    inc d
    srl d
    ret
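The unrolled code above appears to follow the classic bit-by-bit (digit-by-digit) square root, testing one result bit per step from the top down. As a cross-check, here is a small looped Python version of that textbook method. It is a sketch for verifying results, not a translation of the exact register juggling, and the function name is hypothetical.

```python
def sqrt8(a):
    """Return (root, remainder) for an 8-bit input, with remainder = a - root**2."""
    root = 0
    rem = a
    bit = 0x40                         # highest even power of two within 8 bits
    while bit:
        trial = root + bit
        if rem >= trial:               # this result bit is set
            rem -= trial
            root = (root >> 1) + bit
        else:                          # this result bit is clear
            root >>= 1
        bit >>= 2
    return root, rem
```

Running it over all 256 inputs and comparing against the assembly output in an emulator is a quick way to validate an optimized routine like this.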
sqrtfixed_88
An unrolled, fast 8.8 fixed-point square root routine. Uses the above sqrtA routine.

sqrtfixed_88:
;Input: A.E ==> D.E
;Output: DE is the sqrt, AHL is the remainder
;Speed: 690+6{0,13}+{0,3+{0,18}}+{0,38}+sqrtA
;min: 855cc
;max: 1003cc
;avg: 924.5cc
;152 bytes

    call sqrtA
    ld l,a
    ld a,e
    ld h,0
    ld e,d
    ld d,h

    sla e
    rl d

    sll e \ rl d
    add a,a \ adc hl,hl
    add a,a \ adc hl,hl
    sbc hl,de
    jr nc,+_
    add hl,de
    dec e
    .db $FE     ;start of `cp *`
_:
    inc e

    sll e \ rl d
    add a,a \ adc hl,hl
    add a,a \ adc hl,hl
    sbc hl,de
    jr nc,+_
    add hl,de
    dec e
    .db $FE     ;start of `cp *`
_:
    inc e

    sll e \ rl d
    add a,a \ adc hl,hl
    add a,a \ adc hl,hl
    sbc hl,de
    jr nc,+_
    add hl,de
    dec e
    .db $FE     ;start of `cp *`
_:
    inc e

    sll e \ rl d
    add a,a \ adc hl,hl
    add a,a \ adc hl,hl
    sbc hl,de
    jr nc,+_
    add hl,de
    dec e
    .db $FE     ;start of `cp *`
_:
    inc e

;Now we have four more iterations
;The first two are no problem
    sll e \ rl d
    add hl,hl
    add hl,hl
    sbc hl,de
    jr nc,+_
    add hl,de
    dec e
    .db $FE     ;start of `cp *`
_:
    inc e

    sll e \ rl d
    add hl,hl
    add hl,hl
    sbc hl,de
    jr nc,+_
    add hl,de
    dec e
    .db $FE     ;start of `cp *`
_:
    inc e

sqrtfixed_88_iter11:
;On the next iteration, HL might temporarily overflow by 1 bit
    sll e \ rl d      ;sla e \ rl d \ inc e
    add hl,hl
    add hl,hl
    jr c,sqrtfixed_88_iter11_br0
;
    sbc hl,de
    jr nc,+_
    add hl,de
    dec e
    jr sqrtfixed_88_iter12
sqrtfixed_88_iter11_br0:
    or a
    sbc hl,de
_:
    inc e

;On the next iteration, HL is allowed to overflow, DE could overflow with our current routine, but it needs to be shifted right at the end, anyways
sqrtfixed_88_iter12:
    ld b,a      ;A is 0, so B is 0
    add hl,hl
    add hl,hl
    rla
;AHL - (DE+DE+1)
    sbc hl,de \ sbc a,b
    inc e
    or a
    sbc hl,de \ sbc a,b
    ret p
    add hl,de
    adc a,b
    dec e
    add hl,de
    adc a,b
    ret
ncr_HL_DE
Computes 'HL choose DE' in such a way so that overflow only occurs if the final result overflows 16 bits.

; Requires
;    mul16          ;BC*DE ==> DEHL
;    DEHL_Div_BC    ;DEHL/BC ==> DEHL

ncr_HL_DE:
;"n choose r", defined as n!/(r!(n-r)!)
;Computes "HL choose DE"
;Inputs: HL,DE
;Outputs:
;   HL is the result
;       "HL choose DE"
;   carry flag reset means overflow
;Destroys:
;   A,BC,DE,IX
;Notes:
;   Overflow is returned as 0
;   Overflow happens if HL choose DE exceeds 65535
;   This algorithm is constructed in such a way that intermediate
;   operations won't erroneously trigger overflow.
;66 bytes
    ld bc,1
    or a
    sbc hl,de
    jr c,ncr_oob
    jr z,ncr_exit
    sbc hl,de
    add hl,de
    jr c,$+3
    ex de,hl
    ld a,h
    or l
    push hl
    pop ix
ncr_exit:
    ld h,b
    ld l,c
    scf
    ret z
ncr_loop:
    push bc \ push de
    push hl \ push bc
    ld b,h
    ld c,l
    call mul16          ;BC*DE ==> DEHL
    pop bc
    call DEHL_Div_BC    ;result in DEHL
    ld a,d
    or e
    pop bc
    pop de
    jr nz,ncr_overflow
    add hl,bc
    jr c,ncr_overflow
    pop bc
    inc bc
    ld a,b
    cp ixh
    jr c,ncr_loop
    ld a,ixl
    cp c
    jr nc,ncr_loop
    ret
ncr_overflow:
    pop bc
    xor a
    ld b,a
ncr_oob:
    ld h,b
    ld l,b
    ret
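As I read the loop above, each pass updates the running result as result + result*m/k, where m is the larger of r and n-r and k counts up from 1. The division is always exact, so the only values that can exceed 16 bits are ones that imply the final answer overflows too. Here is a hedged Python model of that recurrence. The function name is made up, and it returns None on overflow where the assembly returns 0 with carry reset.

```python
def ncr16(n, r):
    """Model of 'n choose r' limited to 16-bit results (illustrative only)."""
    if r > n:
        return 0                       # out of bounds, same as the ncr_oob path
    small = min(r, n - r)
    big = n - small
    result = 1
    for k in range(1, small + 1):
        # result is C(big+k-1, k-1), so result*big/k is exactly C(big+k-1, k)
        term = result * big // k       # 32-bit product, then divide by k
        if term > 0xFFFF or result + term > 0xFFFF:
            return None                # assumption: None marks overflow
        result += term                 # Pascal's rule gives C(big+k, k)
    return result
```

For example, ncr16(18, 9) fits in 16 bits, while ncr16(19, 9) reports overflow.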
EDIT: Optimized itoa_8 above. Here are some more routines:

uitoa_8
Converts an 8-bit unsigned integer to an ASCII string.

;Converts an 8-bit unsigned integer to a string
uitoa_8:
;Input:
;   A is an unsigned integer
;   HL points to where the null-terminated ASCII string is stored (needs at most 5 bytes)
;Output:
;   The number is converted to a null-terminated string at HL
;Destroys:
;   Up to four bytes at HL
;   All registers preserved.
;on 0 to 9:     238              D=0
;on 10 to 99:   244+20D          D=0 to 9
;on 100 to 255: 257+2{0,6}+20D   D=0 to 5
;min: 238cc
;max: 424cc
;avg: 317.453125cc = 81268/256 = (238*10 + 334*90+313*156)/256
;52 bytes

    push hl
    push de
    push bc
    push af
;A is on [0,255]
;calculate 100s place, plus 1 for a future calculation
    ld b,'0'
    cp 100 \ jr c,$+5 \ sub 100 \ inc b
    cp 100 \ jr c,$+5 \ sub 100 \ inc b

;calculate 10s place digit, +1 for future calculation
    ld de,$0A2F
    inc e \ sub d \ jr nc,$-2
    ld c,a

;Digits are now in D, C, A
; strip leading zeros!
    ld a,'0'
    cp b \ jr z,$+5 \ ld (hl),b \ inc hl \ .db $FE    ; start of `cp *` to skip the next byte, turns into `cp $BB` which will always return nz and nc
    cp e \ jr z,$+4 \ ld (hl),e \ inc hl
    add a,c
    add a,d
    ld (hl),a
    inc hl
    ld (hl),0

    pop af
    pop bc
    pop de
    pop hl
    ret
itoa_16
Converts a 16-bit signed integer to an ASCII string.

;Converts a 16-bit signed integer to an ASCII string.

itoa_16:
;Input:
;   DE is the number to convert
;   HL points to where to write the ASCII string (up to 7 bytes needed).
;Output:
;   HL points to the null-terminated ASCII string
;      NOTE: This isn't necessarily the same as the input HL.
    push de
    push bc
    push af
    push hl
    bit 7,d
    jr z,+_
    xor a
    sub e
    ld e,a
    sbc a,a
    sub d
    ld d,a
    ld (hl),$1A     ;negative char on TI-OS
    inc hl
_:
    ex de,hl

    ld bc,-10000
    ld a,'0'-1
    inc a \ add hl,bc \ jr c,$-2
    ld (de),a
    inc de

    ld bc,1000
    ld a,'9'+1
    dec a \ add hl,bc \ jr nc,$-2
    ld (de),a
    inc de

    ld bc,-100
    ld a,'0'-1
    inc a \ add hl,bc \ jr c,$-2
    ld (de),a
    inc de

    ld a,l
    ld h,'9'+1
    dec h \ add a,10 \ jr nc,$-3
    add a,'0'
    ex de,hl
    ld (hl),d
    inc hl
    ld (hl),a
    inc hl
    ld (hl),0

;Now strip the leading zeros
    pop hl

;If the first char is a negative sign, skip it
    ld a,(hl)
    cp $1A
    push af
    ld a,'0'
    jr nz,$+3
    inc hl
    cp (hl)
    jr z,$-2

;Check if we need to rewrite the negative sign
    pop af
    jr nz,+_
    dec hl
    ld (hl),a
_:

    pop af
    pop bc
    pop de
    ret
uitoa_16
Converts a 16-bit unsigned integer to an ASCII string.

;Converts a 16-bit unsigned integer to an ASCII string.

uitoa_16:
;Input:
;   DE is the number to convert
;   HL points to where to write the ASCII string (up to 6 bytes needed).
;Output:
;   HL points to the null-terminated ASCII string
;      NOTE: This isn't necessarily the same as the input HL.
    push de
    push bc
    push af
    ex de,hl

    ld bc,-10000
    ld a,'0'-1
    inc a \ add hl,bc \ jr c,$-2
    ld (de),a
    inc de

    ld bc,1000
    ld a,'9'+1
    dec a \ add hl,bc \ jr nc,$-2
    ld (de),a
    inc de

    ld bc,-100
    ld a,'0'-1
    inc a \ add hl,bc \ jr c,$-2
    ld (de),a
    inc de

    ld a,l
    ld h,'9'+1
    dec h \ add a,10 \ jr nc,$-3
    add a,'0'
    ex de,hl
    ld (hl),d
    inc hl
    ld (hl),a
    inc hl
    ld (hl),0

;Now strip the leading zeros
    ld c,-6
    add hl,bc
    ld a,'0'
    inc hl \ cp (hl) \ jr z,$-2
    pop af
    pop bc
    pop de
    ret
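The 16-bit routines use a neat trick: instead of restoring after each over-subtraction, they alternate direction. Subtract 10000 until the value goes negative, add 1000 back until it is non-negative again, subtract 100 until negative, add 10 until non-negative, and whatever is left is the ones digit. Here is a Python sketch of that digit extraction. The function name is hypothetical, and unlike the assembly's stripping loop this model deliberately keeps a single "0" for an input of zero.

```python
def uitoa16(n):
    """Model of the alternating subtract/add digit extraction (illustrative only)."""
    assert 0 <= n <= 0xFFFF
    digits = ""
    for power, subtracting in ((10000, True), (1000, False),
                               (100, True), (10, False)):
        if subtracting:
            d = -1                     # like ld a,'0'-1
            while n >= 0:              # loop until the subtraction borrows
                n -= power
                d += 1
        else:
            d = 10                     # like ld a,'9'+1
            while n < 0:               # loop until the addition carries
                n += power
                d -= 1
        digits += str(d)
    digits += str(n)                   # what remains is the ones digit
    # assumption: keep one digit for zero rather than an empty string
    return digits.lstrip("0") or "0"
```

Each stage leaves the running value on the "wrong" side of zero on purpose, which is exactly what the next stage expects.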
5
« on: August 14, 2019, 06:04:00 pm »
Good news! I've finished Cemetech's thread and it was tedious as heck. I've also added a bunch of my personal stash that I think is in an acceptable state. Currently at about 100 routines. EDIT: Finished porting from the other sites.
6
« on: August 14, 2019, 02:46:30 pm »
On this one? The labels are in the right places, but I do notice that sometimes pressing a key will read as the wrong group. EDIT: Also, I'm hoping to put your routines in the repository if you'd like!
7
« on: August 13, 2019, 02:38:20 pm »
Hi folks! I've noticed that the "Z80 Optimized Routines" threads and their equivalents on various sites aren't very easy to navigate. I am starting a repository on GitHub in the hopes of addressing these three issues:
 Organization! "Is this routine documented? What page is it on?"
 Collaboration! "Is there a better version later in the thread? On what page!? Here is yet-another-version!"
 Cleanliness! "What is this random request doing in the middle of the thread?"
I initialized the repository here. My plan is to start porting Cemetech's thread, Omnimaga's thread, UnitedTI's thread, Z80 Heaven's routines, and my private routines folder. If you want to help port documentation, I only ask that you cite the original author if possible, except when the original author doesn't care to be cited. If you want to add your own routines, keep it organized! And if you see an optimization, please make it! A final note: I think it would be great to have an eZ80 and TI-BASIC repository, too, but I don't think I'm up for maintaining that!
8
« on: August 07, 2019, 10:32:25 am »
(P.S. This is what I work on now. Also, I tend to go by bcov77 on other platforms if you feel like googling.)
That is so freaking cool.
9
« on: July 31, 2019, 07:46:51 am »
That is really confusing wording. I think your interpretation is most likely: Does that mean that the ASIC will allow execution on all pages below $180, in other words all of them ?
10
« on: July 29, 2019, 12:37:39 pm »
Oh wow, I hadn't realized that! EDIT: I saw this on that page: NOTE: The contents of this port should NOT be less than 0Ch or the LCD driver will no longer respond.
11
« on: July 29, 2019, 06:21:08 am »
@Sue Doenim : your second routine should use "jr c,", not "jr nz,". I usually go with the second method unless I can get $10 in C, then I use the "in a,(c)" method. I also optionally use compiler directives so the user can use undocumented instructions. For example, in Grammer, I define my LCDDelay routine as:
    in a,(16) \ rla \ jr c,$-3
But one of my favorite tricks that many people don't use (and you'll see in many of my projects) is that if I am only doing full-screen LCD updates and I don't need interrupts, then at the beginning of my program I disable interrupts and write 80h to port 16 (or BFh to port 16 if you are doing it the weird way). Then I can skip that entire step in my LCD update routine, since I write column-by-column and that internal LCD counter is automatically reset to the desired initial value by the end of my routine. It doesn't save much, but it does save space (you almost certainly don't need to worry about an LCD delay between initializing with 80h and the first time you update the LCD), and you save a nonzero number of clock cycles each update, so it really is a "free" optimization.
12
« on: July 28, 2019, 11:49:54 pm »
Hey there, it's ya gender non-specific diminutive Zeda, here, and today we'll be looking at the Fisher-Yates algorithm and just how freaking efficient it can be for shuffling a list. For reference, it takes one second to shuffle a 999-element list at 6MHz, and if that ain't the way your deity intended it, I don't know what is.
First, how do we shuffle L1 in BASIC?
rand(dim(L1→L2
SortA(L2,L1
This is a super clever algorithm, but slow as heck as the lists get bigger. Plus, it uses an extra list of the same size, wasting precious RAM. So how does the Fisher-Yates algorithm work? You start at the last element. Randomly choose an element up to and including the current element and swap them. Now move down one element and repeat (so now the last element is off limits, then the last two, et cetera). Repeat this until there is one element left.
This is easy to perform in-place, and it performs n-1 swaps, making it significantly faster than the BASIC algorithm above. In fact, let's implement it in BASIC:
dim(L1→N
For(K,N,2,-1
randInt(1,K→A
L1(K→B
L1(A→L1(K
B→L1(A
End
This takes approximately 37.5 seconds to shuffle a 999-element list. I don't even have the RAM needed to test the regular method, but extrapolating, it would take the "normal" method approximately 73 seconds for 999 elements. So basically, the Fisher-Yates algorithm is actually faster even in TI-BASIC (after about 400 elements, though).
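If you want to sanity-check the steps described above off-calculator, here is a minimal Python sketch of the same in-place shuffle. The function name and the optional `rng` parameter are just for illustration, and it leans on Python's standard `random` module rather than anything calculator-specific.

```python
import random

def fisher_yates_shuffle(items, rng=random):
    """Shuffle a list in place with n-1 swaps and return it."""
    # Walk from the last element down to the second one.
    for k in range(len(items) - 1, 0, -1):
        a = rng.randrange(k + 1)       # pick from 0..k, current element included
        items[k], items[a] = items[a], items[k]
    return items
```

Passing a seeded `random.Random` instance as `rng` makes the shuffle repeatable, which is handy when comparing against another implementation.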
So without further ado, the assembly code!
;Randomizes a TI-list in Ans

_RclAns= 4AD7h
seed1 = $80F8
seed2 = $80FC

seed1_0=seed1
seed1_1=seed1+2
seed2_0=seed2
seed2_1=seed2+2
#define bcall(x) rst 28h \ .dw x

.db $BB,$6D
.org $9D95

; Put it into 15MHz mode if possible!
    in a,(2)
    add a,a
    sbc a,a
    out (20h),a

; Initialize the random seed
    ld hl,seed1
    ld b,7
    ld a,r
_:
    xor (hl)
    ld (hl),a
    inc hl
    djnz -_
    or 99
    or (hl)
    ld (hl),a

; Locate Ans, verify that it is a list or complex list
    bcall(_RclAns)
    ex de,hl
    ld c,(hl)
    inc hl
    ld b,(hl)
    inc hl
    ld (list_base),hl
    dec a
    jr z,+_
    sub 12
    ret nz
    dec a
_:

;A is 0 if a real list, 1 if complex
;HL points to the first element
;BC is the number of elements
    and $29     ;make it either NOP or ADD HL,HL
    ld (get_complex_element),a
    sub 29h
    sbc a,a     ;FF if real, 00 if complex
    cpl
    and 9
    add a,9
    ld (element_size),a

shuffle_loop:
    push bc

    push bc
    call rand
    pop bc
    ex de,hl
    call mul16
    dec bc
;swap elements DE and BC
    call get_element
    push hl
    ld d,b
    ld e,c
    call get_element
    pop de

    call swap_elements
    pop bc
    dec bc
    ld a,c
    dec a
    jr nz,shuffle_loop
    inc b
    dec b
    jr nz,shuffle_loop
    ret

swap_elements:
;HL and DE point to the elements
element_size = $+2
    ld bc,255
_:
    ld a,(de)
    ldi
    dec hl
    ld (hl),a
    inc hl
    djnz -_
    ret

get_element:
;Input:
;   DE is the element to locate
;Output:
;   HL points to the element
    ld l,e
    ld h,d
    add hl,hl
    add hl,hl
    add hl,hl
    add hl,de
get_complex_element:
    nop
list_base = $+1
    ld de,0
    add hl,de
    ret

rand:
;Tested and passes all CAcert tests
;Uses a very simple 32-bit LCG and 32-bit LFSR
;it has a period of 18,446,744,069,414,584,320
;roughly 18.4 quintillion.
;LFSR taps: 0,2,6,7 = 11000101
;291cc
;Thanks to Runer112 for his help on optimizing the LCG and suggesting to try the much simpler LCG.
;On their own, the two are terrible, but together they are great.
    ld hl,(seed1)
    ld de,(seed1+2)
    ld b,h
    ld c,l
    add hl,hl \ rl e \ rl d
    add hl,hl \ rl e \ rl d
    inc l
    add hl,bc
    ld (seed1_0),hl
    ld hl,(seed1_1)
    adc hl,de
    ld (seed1_1),hl
    ex de,hl
;;lfsr
    ld hl,(seed2)
    ld bc,(seed2+2)
    add hl,hl \ rl c \ rl b
    ld (seed2_1),bc
    sbc a,a
    and %11000101
    xor l
    ld l,a
    ld (seed2_0),hl
    ex de,hl
    add hl,bc
    ret

mul16:
;BC*DE
    ld hl,0
    ld a,16
mul16_loop:
    add hl,hl
    rl e
    rl d
    jr nc,+_
    add hl,bc
    jr nc,+_
    inc de
_:
    dec a
    jr nz,mul16_loop
    ret
It isn't perfect, but it is pretty good and importantly, it is fast! The biggest problem is in the random number generator, but even that is still pretty good for this application.
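For anyone who wants to poke at the generator on a PC, here is my reading of the rand routine as a Python model: a 32-bit LCG stepping x -> 5x+1, a 32-bit left-shifting LFSR that XORs %11000101 into the low byte when a bit falls off the top, and an output formed by adding the two upper 16-bit halves. The second function models how the shuffle loop scales the output into an index by taking the upper 16 bits of the product. Treat this as a sketch. The names are made up and I have only derived it from the listing above.

```python
MASK32 = 0xFFFFFFFF

def rand16(state):
    """Advance an (lcg, lfsr) state pair; return (16-bit output, new state)."""
    lcg, lfsr = state
    lcg = (5 * lcg + 1) & MASK32       # 4x via two shifts, +1, then +x
    carry = lfsr >> 31                 # bit shifted out of the LFSR
    lfsr = (lfsr << 1) & MASK32
    if carry:
        lfsr ^= 0xC5                   # taps 0,2,6,7 = %11000101
    out = ((lcg >> 16) + (lfsr >> 16)) & 0xFFFF
    return out, (lcg, lfsr)

def random_index(state, n):
    """Scale the output to 0..n-1 using the upper half of a 16x16 multiply."""
    r, state = rand16(state)
    return (r * n) >> 16, state
```

The multiply-high scaling is slightly biased whenever n does not divide 65536, which is one way the generator side is less than perfect, but it avoids a division entirely.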
13
« on: July 26, 2019, 03:54:48 pm »
I had heard about this on Facebook and through Juju. I'm glad you survived and I hope you'll recover.
Do the police think it was random, or were you targeted?
14
« on: July 21, 2019, 12:47:35 pm »
Since you can use the timers without interrupts I'd imagine that they count independent of whether or not interrupts are enabled or disabled. However, if the timer hit 0 without your acknowledging and then you later EI, it looks like it'll immediately trigger an interrupt. Specifically, I was reading about the timers' loop control ports, but I'm not experienced with the timers yet, so I may have misinterpreted it.
15
« on: June 29, 2019, 01:06:51 pm »
Thanks a bunch, I might think a way to use this! @E37 : there are some pretty decent existing flash-to-RAM routines. A classic way to increment HL through RAM pages is to do something like:
    inc l
    call z,incHLmem1
    ...
    ...
incHLmem1:
    inc h
    ret po
    ld h,a
    in a,(6)
    inc a
    out (6),a
    ld a,h
    ld h,40h
    ret

That method averages between 14cc and 15cc and advances the page as needed. You need to initialize by swapping in the correct page, though. Here is my take on a flash-to-RAM routine, though:
FlashToRAM:
;Inputs: Same as LDIR, but A is the page number.
;Outputs:
;   Same as LDIR, except A is the ending page.
;
;Speed:
;RAM: 21+21n
;ARC, but no boundary: 114+21n
;Arc, on two pages: 21n+269
;Arc, on three pages: 21n+355
    or a
    jp z,ReadRAM
    out (6),a
    add hl,bc
;    jr c,read_from_Arc_blocks   ;if you need this, you probably need a different routine. This implies that writing will eventually reach the 0x0000 to 0x3FFF range.
    jp p,read_from_ARC_noboundary
read_from_Arc_blocks:
;If we make it here, we know that we cross a page boundary (or in one case, we just reach it and need to return on the next page).
;We will read in blocks to avoid checking page boundaries
;To do so, we first read up to 0x8000 - HL bytes
    xor a
    sbc hl,bc
    sub l \ ld l,a
    ld a,$80 \ sbc a,h \ ld h,a
;now we will subtract BC-HL -> BC
    ld a,c \ sub l \ ld c,a
    ld a,b \ sbc a,h \ ld b,a
    push bc
    ld b,h
    ld c,l
    xor a \ sub l \ ld l,a
    ld a,$80 \ sbc a,h \ ld h,a
;now we read the first block
block_loop:
    ldir
;now we increment the page and continue reading from $4000
    in a,(6)
    inc a
    out (6),a
    ld h,40h
    pop bc
;if BC<$4000, just LDIR the rest
    ld a,b
    sub h
    jr c,read_from_RAM
    ld b,a
    push bc
    ld b,h
    ld c,l
    jp block_loop
read_from_ARC_noboundary:
;    or a    ;already reset
    sbc hl,bc
read_from_RAM:
    ldir
    in a,(6)
    ld b,a
page_restore = $+1
    ld a,0
    out (6),a
    ld a,b
    ld b,c
    ret
ReadRAM:
    ldir
    ret
It needs to run in RAM and uses SMC.
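The block idea in that routine can be pictured without any hardware: a read that starts partway through the $4000-$7FFF window is split into a first chunk that runs up to $8000, then whole-page chunks starting at $4000 on each following page, then a final partial chunk. Here is a small Python sketch of just that bookkeeping. The function name and the tuple layout are invented for illustration, and it does no actual port or memory access.

```python
PAGE_START = 0x4000   # start of the swappable window
PAGE_END = 0x8000     # first address past the window

def split_into_page_blocks(addr, page, count):
    """Return a list of (page, start_address, length) chunks for a paged read."""
    blocks = []
    while count > 0:
        length = min(count, PAGE_END - addr)   # bytes left on this page
        blocks.append((page, addr, length))
        count -= length
        page += 1                              # advance to the next page...
        addr = PAGE_START                      # ...and restart at $4000
    return blocks
```

For example, reading $20 bytes from $7FF0 on page 5 splits into $10 bytes on page 5 and $10 bytes from $4000 on page 6.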
