### Author Topic: Hash-Based Instruction Lookup  (Read 1223 times)

0 Members and 1 Guest are viewing this topic.

#### Xeda112358

• they/them
• Moderator
• LV12 Extreme Poster (Next: 5000)
• Posts: 4629
• Rating: +717/-6
• Calc-u-lator, do doo doo do do do.
##### Hash-Based Instruction Lookup
« on: November 04, 2017, 11:24:57 am »
So I was working on some code to take an alphanumeric input string and look it up in a table and do something. After rehashing a binary search algorithm for the nth time, I remembered an old idea that I had for finding a string in set of data.
Spoiler For Spoiler:
I could basically speed up the search by xor-ing each byte of the search string and using that value is an indicator of a potential match. If the input is n bytes long, then I xor the first n starting bytes of the data to search. If the XORed value matches, double check that you haven't already found a match! Otherwise, set variables tail=data_start and head=data_start+n and x as the XORed value of the input, and acc as the XORed value of the first n bytes of the data. Now XOR the byte at tail with acc and store it back to acc. Increment tail, increment head. XOR the byte at head with acc and store it back to acc. If acc==x then double check the string between tail and head isn't a match, otherwise, continue until head goes beyond the data.
After some poking around on Google, I came across a hash based search algorithm that was just the insight that I needed!

So I came up with some sample commands and I was lucky enough to find a simple hashing function that provided a unique value for all 46 functions.

I'm going to provide the code here, even though it isn't very pretty. This way if I ever lose it, I'll have this as reference
This example executes the command "Disp" which I just made display everything after it.
Code: [Select]
#include "grammer3.inc"_DispHL = 4507h_NewLine = 452Eh_PutS   = 450Ah#define bcall(x) rst 28h \ .dw xscrap = 8000hdenom = scrapnumer = scrap+4out   = scrap+8.db $BB,$6D.org $9D95 ld hl,testinput ld bc,testinput_end-testinput call hashlookup jr c,+_ ld hl,s_NOTFOUND bcall(_PutS) ret_: inc de \ inc de ; Now DE points to the pointer to the parsing code ex de,hl ld a,(hl) \ inc hl \ ld h,(hl) \ ld l,a ex de,hl cpi ret po ;HL points to the input, BC is the size left ;DE points to the code that parses the instruction push de rettestinput: .db "Disp Yay, it works!",0testinput_end:hashlookup:;HL points to the input;BC is the max size;return nc if failed to match, c if success ld (input_save),hl ld (input_savesize),bc ld a,b or c jr z,match_null call computehash ld hl,(input_savesize) xor a sbc hl,bc jr z,match_fail ld b,h ld c,l ld d,a ex de,hl add hl,hl ld (hash),hl ld de,hashlut_builtin add hl,de ld e,(hl) inc hl ld d,(hl) ld hl,(input_save);BC is the input size;HL points to the input string;DE points to the comparison ld a,(de) cp c jr nz,match_fail inc de ld b,c_: ld a,(de) inc de cp (hl) jr nz,match_fail inc hl djnz -_ scf retmatch_null: ld de,t_null+1match_fail: ld hl,(input_save) ld bc,(input_savesize) or a retcomputehash: ld e,0_: ld a,(hl) sub 48 ret c cp 10 jr c,$+5    sub 7    ret c    cp 68    ret nc    ld d,a    add a,a    add a,d    xor e    ld e,a    cpi    jp pe,-_    ret        hashlut_builtin:.dw t_null              ;00.dw t_null              ;01.dw t_null              ;02.dw t_null              ;03.dw t_null              ;04.dw t_null              ;05.dw t_null              ;06.dw t_null              ;07.dw t_null              ;08.dw t_null              ;09.dw t_null              ;0a.dw t_cosh              ;0b.dw t_null              ;0c.dw t_null              ;0d.dw t_null              ;0e.dw t_null              ;0f.dw t_null              ;10.dw t_null              ;11.dw t_atan              ;12.dw t_null              ;13.dw t_sinh              ;14.dw t_null              ;15.dw t_null              ;16.dw t_null              ;17.dw t_null              ;18.dw t_null              ;19.dw t_null              ;1a.dw t_null              ;1b.dw t_sqrt              ;1c.dw t_null              ;1d.dw t_null              ;1e.dw t_max               ;1f.dw t_null              ;20.dw t_null              ;21.dw t_ClrDraw           ;22.dw t_null              ;23.dw t_null              ;24.dw t_null              ;25.dw t_null              ;26.dw t_null              ;27.dw t_null              ;28.dw t_Ellipse           ;29.dw t_null              ;2a.dw t_null              ;2b.dw t_null              ;2c.dw t_null              ;2d.dw t_null              ;2e.dw t_null              ;2f.dw t_null              ;30.dw t_null              ;31.dw t_null              ;32.dw t_null              ;33.dw t_null              ;34.dw t_null              ;35.dw t_null              ;36.dw t_null              ;37.dw t_null              ;38.dw t_null              ;39.dw t_ln                ;3a.dw t_null              ;3b.dw t_null              ;3c.dw t_pi                ;3d.dw t_null              ;3e.dw t_null              ;3f.dw t_null              ;40.dw t_null              ;41.dw t_null              ;42.dw t_null              ;43.dw t_null              ;44.dw t_null              ;45.dw t_null              ;46.dw t_null              ;47.dw t_null              ;48.dw t_null              ;49.dw t_null              ;4a.dw t_null              ;4b.dw t_null              ;4c.dw t_null              ;4d.dw t_null              ;4e.dw t_null              ;4f.dw t_null              ;50.dw t_null              ;51.dw t_null              ;52.dw t_null              ;53.dw t_null              ;54.dw t_null              ;55.dw t_null              ;56.dw t_null              ;57.dw t_null              ;58.dw t_null              ;59.dw t_null              ;5a.dw t_null              ;5b.dw t_null              ;5c.dw t_null              ;5d.dw t_null              ;5e.dw t_null              ;5f.dw t_null              ;60.dw t_null              ;61.dw t_null              ;62.dw t_null              ;63.dw t_null              ;64.dw t_null              ;65.dw t_null              ;66.dw t_null              ;67.dw t_null              ;68.dw t_randint           ;69.dw t_asinh             ;6a.dw t_null              ;6b.dw t_tan               ;6c.dw t_null              ;6d.dw t_null              ;6e.dw t_null              ;6f.dw t_null              ;70.dw t_null              ;71.dw t_null              ;72.dw t_null              ;73.dw t_null              ;74.dw t_acosh             ;75.dw t_null              ;76.dw t_null              ;77.dw t_null              ;78.dw t_null              ;79.dw t_null              ;7a.dw t_null              ;7b.dw t_null              ;7c.dw t_null              ;7d.dw t_null              ;7e.dw t_null              ;7f.dw t_null              ;80.dw t_atanh             ;81.dw t_null              ;82.dw t_null              ;83.dw t_null              ;84.dw t_null              ;85.dw t_Line              ;86.dw t_sin               ;87.dw t_null              ;88.dw t_null              ;89.dw t_e                 ;8a.dw t_null              ;8b.dw t_null              ;8c.dw t_mod               ;8d.dw t_null              ;8e.dw t_null              ;8f.dw t_null              ;90.dw t_min               ;91.dw t_Circle            ;92.dw t_gcd               ;93.dw t_null              ;94.dw t_null              ;95.dw t_null              ;96.dw t_null              ;97.dw t_cos               ;98.dw t_null              ;99.dw t_null              ;9a.dw t_null              ;9b.dw t_null              ;9c.dw t_null              ;9d.dw t_null              ;9e.dw t_null              ;9f.dw t_null              ;a0.dw t_log2              ;a1.dw t_null              ;a2.dw t_Tilemap           ;a3.dw t_log10             ;a4.dw t_null              ;a5.dw t_null              ;a6.dw t_null              ;a7.dw t_null              ;a8.dw t_Text              ;a9.dw t_null              ;aa.dw t_null              ;ab.dw t_null              ;ac.dw t_null              ;ad.dw t_Disp              ;ae.dw t_null              ;af.dw t_null              ;b0.dw t_null              ;b1.dw t_null              ;b2.dw t_null              ;b3.dw t_null              ;b4.dw t_null              ;b5.dw t_null              ;b6.dw t_null              ;b7.dw t_DispBuf           ;b8.dw t_lcm               ;b9.dw t_setseed           ;ba.dw t_null              ;bb.dw t_null              ;bc.dw t_null              ;bd.dw t_null              ;be.dw t_null              ;bf.dw t_pow10             ;c0.dw t_null              ;c1.dw t_null              ;c2.dw t_null              ;c3.dw t_null              ;c4.dw t_pow2              ;c5.dw t_null              ;c6.dw t_null              ;c7.dw t_null              ;c8.dw t_null              ;c9.dw t_null              ;ca.dw t_null              ;cb.dw t_null              ;cc.dw t_null              ;cd.dw t_null              ;ce.dw t_null              ;cf.dw t_null              ;d0.dw t_null              ;d1.dw t_null              ;d2.dw t_null              ;d3.dw t_null              ;d4.dw t_null              ;d5.dw t_null              ;d6.dw t_null              ;d7.dw t_null              ;d8.dw t_null              ;d9.dw t_null              ;da.dw t_null              ;db.dw t_null              ;dc.dw t_Shiftbuf          ;dd.dw t_randseed          ;de.dw t_null              ;df.dw t_null              ;e0.dw t_null              ;e1.dw t_exp               ;e2.dw t_Output            ;e3.dw t_null              ;e4.dw t_Sprite            ;e5.dw t_acos              ;e6.dw t_null              ;e7.dw t_Rect              ;e8.dw t_null              ;e9.dw t_null              ;ea.dw t_null              ;eb.dw t_null              ;ec.dw t_rand              ;ed.dw t_null              ;ee.dw t_null              ;ef.dw t_null              ;f0.dw t_null              ;f1.dw t_null              ;f2.dw t_null              ;f3.dw t_null              ;f4.dw t_null              ;f5.dw t_null              ;f6.dw t_null              ;f7.dw t_null              ;f8.dw t_asin              ;f9.dw t_null              ;fa.dw t_null              ;fb.dw t_null              ;fc.dw t_null              ;fd.dw t_null              ;fe.dw t_tanh              ;fft_null:.db 0t_cosh:.db 4,"cosh"#include "commands\cosh.z80"t_atan:.db 4,"atan"#include "commands\atan.z80"t_sinh:.db 4,"sinh"#include "commands\sinh.z80"t_sqrt:.db 4,"sqrt"#include "commands\sqrt.z80"t_max:.db 3,"max"#include "commands\max.z80"t_ClrDraw:.db 7,"ClrDraw"#include "commands\ClrDraw.z80"t_Ellipse:.db 7,"Ellipse"#include "commands\Ellipse.z80"t_ln:.db 2,"ln"#include "commands\ln.z80"t_pi:.db 2,"pi"#include "commands\pi.z80"t_randint:.db 7,"randint"#include "commands\randint.z80"t_asinh:.db 5,"asinh"#include "commands\asinh.z80"t_tan:.db 3,"tan"#include "commands\tan.z80"t_acosh:.db 5,"acosh"#include "commands\acosh.z80"t_atanh:.db 5,"atanh"#include "commands\atanh.z80"t_Line:.db 4,"Line"#include "commands\Line.z80"t_sin:.db 3,"sin"#include "commands\sin.z80"t_e:.db 1,"e"#include "commands\e.z80"t_mod:.db 3,"mod"#include "commands\mod.z80"t_min:.db 3,"min"#include "commands\min.z80"t_Circle:.db 6,"Circle"#include "commands\Circle.z80"t_gcd:.db 3,"gcd"#include "commands\gcd.z80"t_cos:.db 3,"cos"#include "commands\cos.z80"t_log2:.db 4,"log2"#include "commands\log2.z80"t_Tilemap:.db 7,"Tilemap"#include "commands\Tilemap.z80"t_log10:.db 5,"log10"#include "commands\log10.z80"t_Text:.db 4,"Text"#include "commands\Text.z80"t_Disp:.db 4,"Disp"#include "commands\Disp.z80"t_DispBuf:.db 7,"DispBuf"#include "commands\DispBuf.z80"t_lcm:.db 3,"lcm"#include "commands\lcm.z80"t_setseed:.db 7,"setseed"#include "commands\setseed.z80"t_pow10:.db 5,"pow10"#include "commands\pow10.z80"t_pow2:.db 4,"pow2"#include "commands\pow2.z80"t_Shiftbuf:.db 8,"Shiftbuf"#include "commands\Shiftbuf.z80"t_randseed:.db 8,"randseed"#include "commands\randseed.z80"t_exp:.db 3,"exp"#include "commands\exp.z80"t_Output:.db 6,"Output"#include "commands\Output.z80"t_Sprite:.db 6,"Sprite"#include "commands\Sprite.z80"t_acos:.db 4,"acos"#include "commands\acos.z80"t_Rect:.db 4,"Rect"#include "commands\Rect.z80"t_rand:.db 4,"rand"#include "commands\rand.z80"t_asin:.db 4,"asin"#include "commands\asin.z80"t_tanh:.db 4,"tanh"#include "commands\tanh.z80"poparg:    rettostr:    rets_NOTFOUND:    .db "Command Not Found",0input_save:    .dw 0input_savesize:    .dw 0hash:    .dw 0The code for the Disp.z80 file, already convoluted for your viewing pleasure
Code: [Select]
    .dw __tokenizeDisp  ;First code generates the token, returns a pointer to the name in HL, size of the name in BC    .dw __parsecodeDisp    .dw __compiledcodeDisp    .dw __helpDisp__tokenizeDisp:    ld hl,+_    ld bc,1    ret_:    .db tokGraphics+0__helpDisp:    .db 0__compiledcodeDisp:    .db _poparg    .db _tostr    .db _inlineasm \ .dw +_-$-2 bcall(_PutS) ret_:__parsecodeDisp:; call poparg; call tostr bcall(_PutS) bcall(_NewLine) retEdit: Oops, and the grammer3.inc file (don't get your hopes up, it's just a test ). Code: [Select] tokVars =$00   ;allows for 32 variable types (int, str, float, etc.)tokGraphics = $20 ;allows for 32 basic graphics commands (plot(), Pxl(), Rect(), etc.)tokControl =$40   ;allows for 32 control commands (If, Goto, etc.)tokMath     = $60 ;allows for 128 basic math commandstokExtend =$E0tokInclude  = \$F0var_uint8   = 0var_int8    = 1var_uint16  = 2var_int16   = 3var_uint32  = 4var_int32   = 5var_uint64  = 6var_int64   = 7var_float   = 8var_floatext= 9     ;extended precision floatvar_fixed88 = 10    ;8.8 fixed point number, optimized for speed over var_fixedvar_fixed1616=11    ;16.16 fixed point number, optimized for speed over var_fixedvar_fixed   = 12    ;customized size for integer and fractional part, up to 64 bits totalvar_rational= 13    ;customized size for the numerator and denominator of: int8, int16, int32, and int64. Denominator is always unsigned.var_complex = 14    ;complex pair of floats.var_complexext=15   ;complex pair of extended precision floats.var_gaussian= 16    ;complex pair of integers. Customized size for the real and imaginary parts: int8, int16, int32, and int64.var_gaussian8=17var_gaussian16=18var_sprite  = 28var_array   = 29var_list    = 30var_str     = 31_poparg     = 0_pusharg    = 1_tostr      = 2_inlineasm  = 3

#### TIfanx1999

• ಠ_ಠ ( ͡° ͜ʖ ͡°)
• CoT Emeritus
• LV13 Extreme Addict (Next: 9001)
• Posts: 6173
• Rating: +191/-9
##### Re: Hash-Based Instruction Lookup
« Reply #1 on: November 04, 2017, 04:56:37 pm »
So basically, you're playing around with the idea of making another interpreted language(or just having fun messing around with it?). Cool stuff. ^^

#### Xeda112358

• they/them
• Moderator
• LV12 Extreme Poster (Next: 5000)
• Posts: 4629
• Rating: +717/-6
• Calc-u-lator, do doo doo do do do.
##### Re: Hash-Based Instruction Lookup
« Reply #2 on: November 04, 2017, 05:25:07 pm »
Basically, yes (to both). I'm playing with the mechanics of interpreting code and I like this one in particular.