Omnimaga

Calculator Community => TI Calculators => TI-BASIC => Topic started by: meishe91 on August 25, 2010, 04:47:33 am

Title: Run-Length Encoding for Strings
Post by: meishe91 on August 25, 2010, 04:47:33 am
So before I do anything more on my recent project (http://ourl.ca/6789) I told myself I would work on making the moving engine, map display, and that stuff as good as I could before I continued on. So I decided to mess with RLE compression and came up with a program to compress a string. Just decided that I would share it with you guys for learning purposes or if someone wants to use it or something. (Compressed mine by about 800 bytes.)

Code: (RLE Compression) [Select]
"_?Str2
Str1+Ans?Str1
{0,1?L1
DelVar BDelVar A"sub(Str1,B+1,1?u
Repeat u="_
Repeat u?sub(Str1,B+A+1,1
A+1?A
End
{0,A?L2
LinReg(ax+b) Y1
Equ?String(Y1,Str3
Str2+sub(Str3,1,length(Str3)-3)+u?Str2
B+A?B
DelVar AEnd
DelVar Str3DelVar L1DelVar L2DelVar Y1DelVar usub(Str1,1,length(Str1)-1?Str1
sub(Str2,2,length(Str2)-1?Str2

I know there are probably optimizations that could be made to that I just haven't gone over it really to look for any. I kinda just made it so it'd work and then posted it. If someone wants to optimize, go for it :)

Note:
The only thing that needs to be modified in this one is that the space that gets placed at the end needs to just be any token that IS NOT used in the string you are compressing. So if you string being compressed uses spaces then just replace the space with, for example, a question mark. After that you need to replace the space in the fifth row (where your checking to see if the string part equals something) with that same token as in the first row. Hopefully that makes sense.

As a side note to this I realized this didn't work all that well for maps that were all numbers so I made another program that converts those numbers into some other token of your choice. This one will need some adjusting depending on what your string contains but it is fairly easy. If anyone needs it explained I will.

Code: (Different Tokens) [Select]
"_"+Str1?Str2
For(A,2,1+length(Str2
sub(Str2,1,A-1)+sub("SATRWHM",expr(sub(Str2,A,1))+1,1)+sub(Str2,A+1,length(Str2)-A?Str2
End
sub(Str2,2,length(Str2)-1?Str2

Like I said that one needs to be modified depending on your string and such. The one that is posted is specific to how I wanted my string to be outputted and what my original string was.

So, hopefully someone finds this useful or something. Any questions just ask :)
Title: Re: Run-Length Encoding for Strings
Post by: DJ Omnimaga on August 25, 2010, 02:08:56 pm
Hmm interesting. There isn't a lot of compression stuff documented in BASIC. I wonder how fast is yours?
Title: Re: Run-Length Encoding for Strings
Post by: meishe91 on August 25, 2010, 02:16:29 pm
Well it runs my 1886 byte string in a little under four minutes and takes off 800 bytes but it'll vary from string to string. So it isn't terribly fast but I don't think it's that slow either, but I don't know really. I'm sure there is a better method or optimizations that could be made.
Title: Re: Run-Length Encoding for Strings
Post by: ztrumpet on August 25, 2010, 02:20:44 pm
How long does it take to decompress them? :)
Title: Re: Run-Length Encoding for Strings
Post by: meishe91 on August 25, 2010, 02:22:04 pm
I don't know...haven't made a decompressor yet :P
Title: Re: Run-Length Encoding for Strings
Post by: ztrumpet on August 25, 2010, 02:22:37 pm
lol. :P  Good luck on the decompressor.  I hope it's fast. ;D
Title: Re: Run-Length Encoding for Strings
Post by: meishe91 on August 25, 2010, 02:24:27 pm
Haha thanks. I'll post it once I make it. I'll start on it once I get back from Qdoba :P
Title: Re: Run-Length Encoding for Strings
Post by: DJ Omnimaga on August 25, 2010, 04:29:33 pm
I can't wait to see it in action ^^

Title: Re: Run-Length Encoding for Strings
Post by: meishe91 on August 25, 2010, 05:18:31 pm
In a game or just in general? Because just in general it's not all that exciting to wait for a string to compress :P
Title: Re: Run-Length Encoding for Strings
Post by: nemo on August 25, 2010, 06:24:11 pm
looks nice (:

just thought i'd comment that the Equ>String() command is entirely useless.
Code: [Select]
Equ►String(Y1,Str4
is one byte larger than
Code: [Select]
Y1->Str1

and they're the exact same speed. they actually use the same routines. equ>string is completely useless.
Title: Re: Run-Length Encoding for Strings
Post by: meishe91 on August 25, 2010, 06:26:45 pm
When I try to do that though I get a ERR:DATA TYPE ???

By the way, I was just telling Z that I knew you'd pop in sometime and find something :P Also, thanks :)
Title: Re: Run-Length Encoding for Strings
Post by: Builderboy on August 25, 2010, 06:30:26 pm
I believe its the String>Equ( command that is the useless one.  But for some reason it doesn't go both ways.  You need Equ>String() to transfer an equation to a string, but you dont need String>Equ() to transfer a String to an equation o.O
Title: Re: Run-Length Encoding for Strings
Post by: meishe91 on August 25, 2010, 06:32:22 pm
I believe its the String>Equ( command that is the useless one.  But for some reason it doesn't go both ways.  You need Equ>String() to transfer an equation to a string, but you dont need String>Equ() to transfer a String to an equation o.O

Ya, Builder, I think you're right. That is yet another thing that TI failed upon :P Thanks for clearing that up :)
Title: Re: Run-Length Encoding for Strings
Post by: nemo on August 25, 2010, 06:33:17 pm
oops. my bad, i thought there was only one of those commands.
Title: Re: Run-Length Encoding for Strings
Post by: meishe91 on August 25, 2010, 06:35:06 pm
Nah, there is both of those, for what ever reason that we will never know. Thanks though :)
Title: Re: Run-Length Encoding for Strings
Post by: nemo on August 25, 2010, 08:30:53 pm
just changed a few minor things.

Code: [Select]
"_→Str3
Str1+Ans→Str1
"sub(Str1,B+1,1→u    .[2nd]+[7] u, not the lowercase one.
{0,1→L1
DelVar B1→A
Repeat u="     .again, the equation variable u. one space.
Repeat u≠sub(Str1,B+A,1
A+1→A
End
{0,A-1→L2
LinReg(ax+b) Y1
Equ►String(Y1,Str4
Str3+sub(Str4,1,length(Str4)-3)+u→Str3      .must be in this order
B+A-1→B
1→A
End
sub(Str3,2,length(Str3)-1→Str3

i'm not 100% sure all the changes are made are valid, since i don't have any strings to compress into RLE and only tested things on the homescreen. something you might want to consider is to initialize A as 0 instead of one. for one, this allows you to use DelVar. secondly, statements like B+A-1→B and {0,A-1→L2 become B+A→B and {0,A→L2, saving a few bytes. also, one thing to consider is a limit. since one character can only hold numbers 0-9, what if someone in their string has 13 2's in a row? your compressed string would show 132, but the decompression routine would read it as 1 tile #3, and then 2 of the next tile #.
Title: Re: Run-Length Encoding for Strings
Post by: meishe91 on August 25, 2010, 08:58:17 pm
Well just from looking at it it looks like those are all valid. Thanks. I didn't think about using u to store sub(Str1,B+1,1, though that is still a new thing for me. I'm testing it all right now. As for the set A to zero first thing, I thought of that but didn't wanna bother with adjusting things if it needed to be done at the time but I can look at it. It looks like you're right though, I think the only change would be to change row seven to Repeat u≠(Str1,B+A+1,1. As for the limit thing I'm not sure. I mean I know what you mean, since my original string was just numbers, but that's why I provided that second set of code to replace the numbers with a different token so it can be read easily.

So, as I was writing this it finished compressing and those changes did work. Though it took a little longer because, I think, it has to solve u each time it comes up. Which isn't a big deal though. I'm currently testing the changes with A.

As for a decompresser, I am still working on it. I'm close, I think. I got something to work but I don't think it worked properly because the final string ended up being like 200 bytes less than it normally is ??? The weird part is that the beginning and end both match up just fine though, so I don't know what is going wrong with it (unless someone else can explain a loss in bytes).

And again, while writing this the changing A program finished and it seemed to work as well. I'll make these changes to the first post though. Thanks again, nemo.
Title: Re: Run-Length Encoding for Strings
Post by: nemo on August 25, 2010, 09:39:52 pm
np meishe. yeah, the slowdown is probably due to u being solved each time. but i never aim for speed in a compression routine, since compression routines aren't likely to be used in games, just the corresponding decompression routine.

i'm thinking your data loss is due to the limit issue i have with your code. make your first 11 characters in the string your trying to compress the same tile number, and then compress the string. see if you get a matching beginning sequence after decompressing the compressed string.
i'm thinking that the string "111111111111" 12 1's, will compress to "121" which will then give an error during decompression.

made a few more changes, added in my limit idea (i still think it's a good idea):
Code: [Select]
DelVar BDelvar A"_→Str3
Str1+Ans→Str1
"sub(Str1,B+1,1→u   
Repeat u="     
Repeat A=9 or u≠sub(Str1,B+A+1,1
A+1→A
End
Str3+sub("0123456789",A+1,1)+u→Str3      .must be in this order
B+A→B
Delvar AEnd
sub(Str3,2,length(Str3)-1→Str3

if you want to have your strings be compressed where the highest run can be greater than ten,  you can easily substitute A=9 for a higher number, and add onto the sub("0123456789") command. for example, to expand it to 13 you would have A=12 and sub("0123456789ABCD",A+1,1), but your decompression routine would have to be modified to allow that.
Title: Re: Run-Length Encoding for Strings
Post by: meishe91 on August 25, 2010, 09:50:25 pm
I'm fairly sure the limit thing is not the issue because I don't have my map as numbers. I changed the whole map to letter tokens to help make it readable, via the second code. So my compressed data looks something like "19A12B4D5A..." I actually might have been reading the wrong string data though, though I never checked. Probably should have :P I can check later. But the decompression routine I have now relies on the fact that you are using a non-integer token for your map. I'm still working on it but it looks like it is working.

On the limit thing, I never said it wasn't a good idea. I said it was unnecessary for what I was doing because it was just to compress the data, which is in letter tokens, down. If I were to use RLE for my game I would probably have the limit thing for a couple reasons.
Title: Re: Run-Length Encoding for Strings
Post by: nemo on August 25, 2010, 09:54:08 pm
i don't follow how you store your data, but ok. good luck with your routines, i hope to see them (:
Title: Re: Run-Length Encoding for Strings
Post by: meishe91 on August 25, 2010, 10:12:04 pm
Ok, sorry. I'll try to explain. I'll use part of my actual map as the example. So my original map data goes as shown:

Code: [Select]
111111111111111
100300020000000
100305020000000
100300020000000
100344420000000

So the actual sting looks like:

Code: [Select]
"111111111111111100300020000000100305020000000100300020000000100344420000000→Str1
When I was going to compress the data the first time I ran into that issue you were saying, how like the first part will say "151." So instead of setting a limit, which I thought about, I decided I just wanted to compress this thing in whole without a limit to see how much memory I could save. But with the map being in numbers it was to hard to read so I came up with that second program:

Code: [Select]
"_"+Str1→Str2
For(A,2,1+length(Str2
sub(Str2,1,A-1)+sub("SATRWHM",expr(sub(Str2,A,1))+1,1)+sub(Str2,A+1,length(Str2)-A→Str2
End
sub(Str2,2,length(Str2)-1,Str2

So since each number represented a different tile I just matched the number up with a letter that represented the tile.

0=Space=S
1=Border Tree=A
2=Mid-Map Tree=T
3=Rock=R
4=Water=W
5=House=H
6=Mart=M

That program just replaces each number with its respective token turning the map into:

Code: [Select]
AAAAAAAAAAAAAAA
ASSRSSSTSSSSSSS
ASSRSHSTSSSSSSS
ASSRSSSTSSSSSSS
ASSRWWWTSSSSSSS

Therefore making the string now:

Code: [Select]
AAAAAAAAAAAAAAAASSRSSSTSSSSSSSASSRSHSTSSSSSSSASSRSSSTSSSSSSSASSRWWWTSSSSSSS→Str1
Using that string there is no issue reading "16A2S1R3S..." So now if I can get the decompression to work correctly you don't run into the issue of it reading three numbers in a row or only two or something. Does that make more sense?
Title: Re: Run-Length Encoding for Strings
Post by: DJ Omnimaga on August 25, 2010, 10:21:35 pm
In a game or just in general? Because just in general it's not all that exciting to wait for a string to compress :P
in a game demo or something. Just to see how fast it decompresses
Title: Re: Run-Length Encoding for Strings
Post by: meishe91 on August 25, 2010, 11:45:33 pm
Ok, so a little update. First I'm gonna post the modified code that nemo made that just cleans everything up so you only have your original string and the outputted string left. Just makes it a little cleaner in the end, I think.

Code: (RLECMPN) [Select]
DelVar BDelVarA"_→Str2
Str1+Ans→Str1
"sub(Str1,B+1,1→u
Repeat u="_
Repeat A=9 or u≠sub(Str1,B+A+1,1
A+1→A
End
Str2+sub("0123456789",A+1,1)+u→Str2
B+A→B
DelVar AEnd
DelVar ADelVar BDelVar usub(Str1,1,length(Str1)-1→Str1
sub(Str2,2,length(Str2)-1→Str2

Ok, now for the decompression part. While I am still working on decompressing what my compression makes but I do have a way of decompressing what nemo's code gives you.

Code: (RLEDECMN) [Select]
"_→Str3
For(A,1,length(Str2)-1,2
For(B,1,expr(sub(Str2,A,1
Str3+sub(Str2,A+1,1→Str3
End
End
DelVar ADelVar Bsub(Str3,2,length(Str3)-1→Str3

Edit:
Figured out how to decompress mine now too :)

Code: (RLEDECMP) [Select]
"_→Str3
Str2+Ans→Str2
DelVar A1→B
Repeat "_"=sub(Str2,B+A,1
Repeat not(inString("0123456789",sub(Str2,B+A,1
A+1→A
End
For(C,1,expr(sub(Str2,B,A
Str3+sub(Str2,B+A,1→Str3
End
B+A+1→B
DelVar AEnd
DelVar ADelVar BDelVar Csub(Str2,1,length(Str2)-1→Str2
sub(Str3,2,length(Str3)-1→Str3

I'm sure there are some optimizations to be made.
Title: Re: Run-Length Encoding for Strings
Post by: ztrumpet on August 26, 2010, 10:03:30 am
Cool!  How does this impact the speed of your game? :)
Title: Re: Run-Length Encoding for Strings
Post by: meishe91 on August 26, 2010, 05:57:03 pm
I haven't implemented any of it yet :P I'm debating what I wanna do in terms of using this stuff.