Omnimaga

Calculator Community => Other Calc-Related Projects and Ideas => TI Z80 => Topic started by: gtaforever00 on October 30, 2011, 08:32:36 pm

Title: Text compression for AXE programs
Post by: gtaforever00 on October 30, 2011, 08:32:36 pm
Well, I have just been getting the hang of the AXE language and decided that my first project is going to be some kind of text compression for my future projects.  I was thinking of using A through Z, <space>, <period>, <question mark>, <comma>, <single quote>, and <colon>.  That is 32 characters altogether, so you can use 5 bits per character instead of 8 and therefore save about 37.5% of the space.  If you had 500 characters (500 bytes) and used this scheme, your total memory usage would be approximately 313 bytes.  What does everybody think?  Would this benefit anybody in their current projects or future ones?  Input would be appreciated, thanks.
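For anyone who wants to try the idea off-calc first, here is a minimal Python sketch of 5-bit packing. It is only an illustration under stated assumptions: the order of the symbols in ALPHABET and the names pack5 and unpack5 are made up for this example and are not the actual routine discussed in this thread.

```python
# Hypothetical 32-symbol alphabet: A-Z plus six punctuation marks.
# The ordering is an assumption made for this sketch only.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ .?,':"

def pack5(text):
    """Pack each character into 5 bits, most significant bit first."""
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    out = bytearray()
    for ch in text:
        acc = (acc << 5) | ALPHABET.index(ch)
        nbits += 5
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1          # keep only the leftover bits
    if nbits:
        out.append((acc << (8 - nbits)) & 0xFF)  # zero-pad the last byte
    return bytes(out)

def unpack5(data, count):
    """Recover `count` characters from bytes produced by pack5."""
    acc = 0
    nbits = 0
    chars = []
    for byte in data:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= 5 and len(chars) < count:
            nbits -= 5
            chars.append(ALPHABET[(acc >> nbits) & 0x1F])
        acc &= (1 << nbits) - 1
    return "".join(chars)
```

The character count has to be stored or known separately, because the zero padding in the last byte could otherwise be read as extra characters.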
Title: Re: Text compression for AXE programs
Post by: epic7 on October 30, 2011, 09:12:44 pm
That would be useful.
Title: Re: Text compression for AXE programs
Post by: Hayleia on October 31, 2011, 04:00:56 am
Quote from: gtaforever00
I was thinking if you use A thru Z, <space>, <period>, <question mark>, <comma>, <single quote>, <colon>.  That is 32 characters altogether.  You can use 5 bits instead of the 8 bits and therefore saving about 37.5% space.

You forgot something: There is not only A,B,C,... but also a,b,c,... ;)
But yeah, that would be very useful :D (don't forget the decompression algorithm :P)
Title: Re: Text compression for AXE programs
Post by: AngelFish on October 31, 2011, 04:21:54 am
a,b, and c are two byte tokens on-calc. They aren't represented the same as normal ASCII.
Title: Re: Text compression for AXE programs
Post by: gtaforever00 on October 31, 2011, 09:40:01 am
Well, I was also thinking about the lowercase letters, but in reality they just look good and take up more space *.*.  I might implement another version later that handles more characters if there is enough support, but it will also give less compression.  I have almost got the compression engine finished :thumbsup:.  It looks pretty solid  :banghead:,  I have not had too many bugs in it that were not easily fixable.   Axe is definitely easier to program in if you look at it like assembly code in a sense, but with syntax like BASIC.
Title: Re: Text compression for AXE programs
Post by: gtaforever00 on October 31, 2011, 01:06:58 pm
Well, an update on the project.  The input for the program is the TI-OS Str1 and the output is Str2.  In the demo program the input string is 162 bytes and the output is 108 bytes.  That is about 33% compression with the strings.  Imagine working on an RPG with 2500 bytes of text: compress it by 33% and that is ~1675 bytes.  That is about 825 bytes less than the original, which is pretty impressive.  I will probably have it store the data as hexadecimal in Str2 for convenience.  Any suggestions will be considered.
Title: Re: Text compression for AXE programs
Post by: Builderboy on October 31, 2011, 02:20:25 pm
Quote from: AngelFish
a,b, and c are two byte tokens on-calc. They aren't represented the same as normal ASCII.

Yes they are, because we are talking about character codes, not tokens.  The ASCII chart that the calc uses has 256 characters that are each 1 space long.  Each can be referenced by a single byte number, using the Axe command >Char. 
Title: Re: Text compression for AXE programs
Post by: FinaleTI on October 31, 2011, 02:48:41 pm
This sounds neat! I could possibly use this for Nostalgia, as I only use uppercase letters.
Title: Re: Text compression for AXE programs
Post by: C0deH4cker on October 31, 2011, 03:00:26 pm
Looks interesting... I have a project that could benefit from this...
Title: Re: Text compression for AXE programs
Post by: Hayleia on October 31, 2011, 03:38:19 pm
Quote from: gtaforever00
That is about 33% compression with the strings.

O.O

Quote from: gtaforever00
Any suggestions will be considered.

Please, please, please add lowercase <insert begging smiley here>
In my game, I have more than 1900 bytes of text data. Your compression would save a lot of space.
Title: Re: Text compression for AXE programs
Post by: parserp on October 31, 2011, 04:47:47 pm
This is way cool!!!
and yes please do include lowercase. <insert puppy dog eyes> :P
Title: Re: Text compression for AXE programs
Post by: gtaforever00 on October 31, 2011, 05:25:08 pm
OK, I'll see what I can do with the lowercase letters.  Another thing: what about numbers?  With my current routine and the plans for lowercase, I can only have about 64 characters.  I could take out some of the symbols if numbers were needed.
Title: Re: Text compression for AXE programs
Post by: parserp on October 31, 2011, 05:29:22 pm
what do you mean by symbols?
Title: Re: Text compression for AXE programs
Post by: Quigibo on October 31, 2011, 05:52:54 pm
Another idea, during decompression, you can automatically make the first letter of every sentence uppercase, and the rest lowercase. That way it still looks nice, but it takes up the same amount of space.  You can use other rules too to capitalize things in quotes.  Also, don't forget about characters '0' through '9', those are probably important too.
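A rough Python sketch of the sentence rule suggested above, just to show how little logic the decoder needs. The function name and the choice of ".", "?" and "!" as sentence terminators are assumptions for this example, not part of any posted code.

```python
def capitalize_sentences(text):
    """Lowercase everything, then uppercase the first letter of each sentence."""
    out = []
    start_of_sentence = True
    for ch in text.lower():
        if start_of_sentence and ch.isalpha():
            out.append(ch.upper())
            start_of_sentence = False
        else:
            out.append(ch)
        if ch in ".?!":               # assumed sentence terminators
            start_of_sentence = True
    return "".join(out)
```

With this rule alone, names and other proper nouns come out in lowercase, so a real decoder would probably need some extra rule or marker for them.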
Title: Re: Text compression for AXE programs
Post by: Hayleia on November 01, 2011, 03:31:07 am
Quote from: gtaforever00
OK, I'll see what I can do with the lowercase letters.  Another thing: what about numbers?  With my current routine and the plans for lowercase, I can only have about 64 characters.  I could take out some of the symbols if numbers were needed.

Personally, I use numbers 1,2,3,4 and symbols !?.,-':
I think that is all. But yeah, I am not alone on this planet, so do what you think is fair :P

Quote from: Quigibo
Another idea, during decompression, you can automatically make the first letter of every sentence uppercase, and the rest lowercase. That way it still looks nice, but it takes up the same amount of space.  You can use other rules too to capitalize things in quotes.  Also, don't forget about characters '0' through '9', those are probably important too.

But if we want to display "Hi Ginny !", then Ginny will be in lowercase D:
But yeah, that would save space.
Title: Re: Text compression for AXE programs
Post by: gtaforever00 on November 01, 2011, 10:31:02 pm
OK, since the last few posts, I have decided to just start over and rebuild the algorithm from scratch. :o  I had to add another bit to each character to allow 64 characters, so the maximum compression would be around 25%.  I also had to add a little extra data, but it will hopefully stay in the 20s. ::)  I have decided to use Quigibo's idea with the decoder.  Thanks for the idea, Quigibo. ;D  The decoder will automatically uppercase the first letter of every sentence and every single I that stands on its own as a word, and if you want a certain character uppercase, all you have to do is add a token in front of that character.  That adds a little more space to the data, but you get more control and more compression than with the other ideas.

All the letters, numbers, and countless symbols will be usable.  I may start a poll to see what would be most wanted on the special symbols.  Well I think I am going to start building this beast! :evillaugh:
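Here is a small Python sketch of how the three capitalization rules could fit together in a decoder. It works on already-unpacked lowercase text; the "^" escape token, the function name, and the exact test for a standalone I are all assumptions made for illustration, not the real routine.

```python
SHIFT = "^"  # hypothetical escape token: uppercase the next character

def decode_case(stored):
    """Apply the capitalization rules to lowercase stored text."""
    out = []
    start_of_sentence = True
    force_upper = False
    for i, ch in enumerate(stored):
        if ch == SHIFT:
            force_upper = True       # affects only the next character
            continue
        prev_ch = stored[i - 1] if i > 0 else " "
        next_ch = stored[i + 1] if i + 1 < len(stored) else " "
        # A lone "i" has no letter directly before or after it.
        lone_i = ch == "i" and not prev_ch.isalpha() and not next_ch.isalpha()
        if ch.isalpha() and (force_upper or start_of_sentence or lone_i):
            out.append(ch.upper())
        else:
            out.append(ch)
        if ch.isalpha():
            start_of_sentence = False
        elif ch in ".?!":            # assumed sentence terminators
            start_of_sentence = True
        force_upper = False
    return "".join(out)
```

Each escape token costs one extra symbol in the stored data, so text with many forced capitals compresses a little less than plain lowercase text.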
Title: Re: Text compression for AXE programs
Post by: C0deH4cker on November 02, 2011, 10:36:27 am
Good idea with the uppercase letters! Can't wait to see this when it's done.
Title: Re: Text compression for AXE programs
Post by: gtaforever00 on November 03, 2011, 02:29:23 pm
Well, I finally got the compression engine made.  The input is a string in the TI-OS Str1, and it outputs in hex to the TI-OS Str2.  For the decompression, I will probably store the decoded text in an appvar.  Does that sound OK to everyone?  I want to try not to use any free RAM or variables, as I want this to be easily incorporated into any program.  I will probably use the subroutine variables r1 - r6 and the appvar.  The compression is around 20% to 25%.  The more characters you have, the more compression it seems to achieve.
Title: Re: Text compression for AXE programs
Post by: Keoni29 on November 03, 2011, 02:44:23 pm
Cool! Lowercase supported?
Title: Re: Text compression for AXE programs
Post by: gtaforever00 on November 04, 2011, 01:31:56 pm
Well here is a screenshot of the current state:


(http://www.fileden.com/files/2011/11/7/3221137/TI_83_84_Projects/Text_Compression_AXE/Screenshots/testing_00.gif)

Str1 is the input; it is 151 bytes.
Str2 is the output; it is 119 bytes. O.O
Str3 is the output in hex; it is 227 bytes.

Instead of the hex output, I could just use the raw output. It does not always look pretty in TI-OS, but it would save space in the source.

I have started on the decompression and it is going well so far.  I have also thought about maybe running a simple LZW compression over the already compressed data to compress it even more, or even using LZW as the primary compression technique if it yields better results. ???
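For reference, a textbook LZW pass looks roughly like the Python sketch below. This is a generic version with an unbounded dictionary and integer output codes, written only to illustrate the technique; it is not calculator code, and the function names are made up for this example.

```python
def lzw_compress(data):
    """Return a list of integer codes for the given bytes (classic LZW)."""
    table = {bytes([i]): i for i in range(256)}  # all single-byte entries
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate                  # keep extending the match
        else:
            codes.append(table[current])
            table[candidate] = next_code         # learn the new sequence
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes):
    """Invert lzw_compress."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            entry = previous + previous[:1]      # code defined by this step
        else:
            raise ValueError("invalid LZW code")
        out += entry
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return bytes(out)
```

The codes are wider than 8 bits, so on short strings the result can end up larger than the input, and the dictionary also needs working memory. Whether it helps on top of bit-packed text would have to be measured.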

I will still take any suggestions you may have.  I feel getting input from a group of users that would use the tool will result in a better program in the end.   ;D

Quote from: Keoni29
Cool! Lowercase supported?

Yes lowercase will be supported through the decompression.
Title: Re: Text compression for AXE programs
Post by: gtaforever00 on November 05, 2011, 10:18:52 pm
Ok here is another screenie for you all. :hyper:

(http://www.fileden.com/files/2011/11/7/3221137/TI_83_84_Projects/Text_Compression_AXE/Screenshots/testing_01.gif)

23% is not too bad.  The decompression engine is coming along nicely.  I am just working out a few bugs in the code and hope to finish soon.
Title: Re: Text compression for AXE programs
Post by: gtaforever00 on November 07, 2011, 02:21:07 pm
Ok here is every character available on the calculators:

(http://www.fileden.com/files/2011/11/7/3221137/TI_83_84_Projects/lgfont.png)

If you want, post the top 10 symbols you would want the most, along with a possible token to use for each one's representation on calc.

I already have all letters, numbers, period, question mark, exclamation, comma, single quote, opening parenthesis, closing parenthesis, colon, and negative sign so please do not post these again.

Example:
    Hex character / TI on-calc token
1. 05h (right-pointing arrow) / > token
2. 3Bh (semicolon) / i (complex) token
3. F2h (money sign) / pi symbol token


Hex codes are written as (row|column)h: the first digit is the row in the chart above and the second is the column.
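The (row|column)h notation can be sketched in a few lines of Python, assuming the first hex digit is the chart row and the second is the column. The function names are made up for this example.

```python
def char_code(row, column):
    """Combine a chart row and column (each 0-15) into a one-byte code."""
    if not (0 <= row <= 0xF and 0 <= column <= 0xF):
        raise ValueError("row and column must be in 0..15")
    return (row << 4) | column

def chart_position(code):
    """Split a one-byte character code back into (row, column)."""
    return code >> 4, code & 0x0F
```

For example, row 3 and column B give 3Bh, the semicolon entry in the list above.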