Compressing maps

Calculator Community > Other Calc-Related Projects and Ideas

Compressing maps

<< < (2/2)

MGOS:
well, the problem with that is the fact that the values each character can take are very limited. If you're using a LUT-String with InString( you can get to 256 different (probably less) values per character. Since 2*3*5*7 is already 210, 4 maps is the maximum to store in a string using the prime factor compression. I'm not sure whether strings are still smaller than matrices when you use two characters for each square. And it would be real mess to code.
In case you only need to store two values (1's and 0's) in each map, a binary compression would be a lot more efficient. Each bit of a number would be the value of the square in the corresponding map. With binary compression, you could store up to 7 (maybe 8 ) maps in a string, and a lot more in the matrices. When more different tiles are needed, it can be rather easily changed that a group of 2, 3 or 4 bits represent one tile (same method; explained in the initial post of sjasogun1).

sjasogun1:

--- Quote from: Builderboy on March 27, 2013, 12:03:31 pm ---This sounds like a very intriguing new way to save memory! You better watch out though, since even though the maximum value for a number in a matrix is 10^99, the numbers are stored in floating point format. What this means is that even though you can store very large numbers, the precision of them will only be about the last 14 digits or so (depending on your calculator model). This will result in extremely large numbers being seemingly divisible by numbers they cannot be divisible by. For example, a large enough power of 2 will return 0 when you ask for remainder 3.

--- End quote ---

You're absolutely right! I just tested it, and 14 digits is indeed the maximum precision for my model, which lowers the limit for 1-feature maps to 12 and 10 for 2-feature maps. A bit of a bummer, but still a significant gain, especially when considering that you can always take another matrix to store a dozen extra maps.

--- Quote from: Xeda112358 on March 27, 2013, 01:35:50 pm ---Another method that you might be interested in works like this:
Say you have 10 different tiles and numbers are stored with 14 digits of precision. You can then store each map layer with the nth digit in the number. To access the map in TI-BASIC code:

--- Code: ---10fPart(10^-N[A]

--- End code ---
If you aren't familiar with TI-BASIC, fPart( simply grabs the decimal part of a number (leaving off the integer part). If the language you are using doesn't have that, there is probably some equivalent to a floor() function (round down to the nearest integer) and you would essentially do #-floor(#).

If you have M different tiles, you can theoretically store up to floor(14/log(M)) different maps in a single matrix element, this way. However, I would go with 1 less than that to avoid precision problems. Your code for extracting a given matrix would be to replace the 10 in the above code with M.

I hope to see more ideas :D Encoding data with primes and properties of numbers can have some pretty interesting effects.

--- End quote ---

That's a very interesting method! There are only two small downsides it really has: one is being limited to a maximum of 10 features before it reduces your storage capability by half. But when you use two decimal places for each tile instead of 1 you can store as many as 100(!) different tiles, allowing you to store maps that are more artistic, while still reducing your storage by a factor of 6 (I take 12 decimal places as the limit just to make sure I don't go over it, otherwise the factor would be 7 (14/2)). The second is that this storage method is far less efficient when you use it with maps that don't need 10 different features, but even so it is more efficient than my method since it allows for up to 13 maps to be stored in a single matrix as opposed to my 12 maps which is only in the best-case-scenario of using only 1 terrain feature.

Hmmm... I think I might change the first post a bit to make this thread more of a discussion for different methods to compress maps, and perhaps even to simply 'compression methods' if the need to discuss compression for data other than maps arises.

Xeda112358:

--- Quote from: sjasogun1 on March 28, 2013, 06:44:09 am ---
--- Quote from: Xeda112358 on March 27, 2013, 01:35:50 pm ---Another method that you might be interested in works like this:
Say you have 10 different tiles and numbers are stored with 14 digits of precision. You can then store each map layer with the nth digit in the number. To access the map in TI-BASIC code:

--- Code: ---10fPart(10^-N[A]

--- End code ---
If you aren't familiar with TI-BASIC, fPart( simply grabs the decimal part of a number (leaving off the integer part). If the language you are using doesn't have that, there is probably some equivalent to a floor() function (round down to the nearest integer) and you would essentially do #-floor(#).

If you have M different tiles, you can theoretically store up to floor(14/log(M)) different maps in a single matrix element, this way. However, I would go with 1 less than that to avoid precision problems. Your code for extracting a given matrix would be to replace the 10 in the above code with M.

I hope to see more ideas :D Encoding data with primes and properties of numbers can have some pretty interesting effects.

--- End quote ---

That's a very interesting method! There are only two small downsides it really has: one is being limited to a maximum of 10 features before it reduces your storage capability by half. But when you use two decimal places for each tile instead of 1 you can store as many as 100(!) different tiles, allowing you to store maps that are more artistic, while still reducing your storage by a factor of 6 (I take 12 decimal places as the limit just to make sure I don't go over it, otherwise the factor would be 7 (14/2)). The second is that this storage method is far less efficient when you use it with maps that don't need 10 different features, but even so it is more efficient than my method since it allows for up to 13 maps to be stored in a single matrix as opposed to my 12 maps which is only in the best-case-scenario of using only 1 terrain feature.

Hmmm... I think I might change the first post a bit to make this thread more of a discussion for different methods to compress maps, and perhaps even to simply 'compression methods' if the need to discuss compression for data other than maps arises.

--- End quote ---

If you use the method explained a little later, you can use a map with something like 5 features and get 20 maps in one matrix. With only 2 features, you can get 46 maps in one matrix because 14/log(2) is about 46.5. I used 10 as an example because it would be easier to think of each layer as a digit in base 10 instead of a digit in base M :P If you wanted to use up to 30 tiles, you could get floor(14/log(30))=9 maps.

But I did think of another way to get a few more maps, just now. Floating Point numbers usually have a separate bit to store whether it is negative or positive, so you actually probably have 2*1014-1 numbers to work with. That is how it works on the TI calculators, anyway (1014-1 negative numbers, 1014-1 positive numbers, and 0).

Ooh, I just had an idea for quickly checking (with your method). For example, a 4-feature map will use 2,3 for the first layer, 5,7 for the second, and so on. Then to check layer 1, simply use gcd() of the matrix element and 6. The second layer, use the matrix element and 35 and so on. Then you just check if the value is {1,2,3,6} or {1,5,7,35} et cetera, corresponding to the tile.

Another neat method for storing maps is to use dynamic tiles that change based on the tiles it is in contact with. For example, a trick I used in TI-BASIC for storing map data for a maze was to store to a picture (black and white) the map. When drawing a tile, it would test the corresponding pixel, then check above, below, and to either side to see precisely which tile was needed. In this way, you could store a tilemap using a single bit for each tile, using 16 tiles, which is a rather crazy level of compression. In normal circumstances, you need at least 4 bits for each tile if you have 16 tiles (2^4=16). When I took remade it in assembly, it compressed a 32*32 map (1024 tiles) to 128 bytes, using 16 unique tiles. In pure TI-BASIC I think the best compression I could get would be using a 19-element list (a little over 171 bytes).

Here is a screenshot of one of the many offshoots I made using this technique:

sjasogun1:
Holy necropost batman!

Yeah yeah, I'm resurrecting this from its grave, but since the last time I visited here I've studied Computer Science at a university for a couple of years so I wanted to make an addendum to this technique in case anybody is still interested. Note that this is still interesting no matter what kind of calculator you are using, even if you don't use matrices.

So, the entire point of this compression in the first place was to reduce the wasted space from numbers having a much larger range than what you are actually making use of. My initial idea was to use prime numbers, but even that's wasteful is what I've found now that I have more experience with this sort of stuff. I won't touch on other compression techniques that rely on encoding here as a quick search on the site for "compression" already reveals plenty of topics about that.

So, the best way to actually do this is to use binary numbers instead. Take the number of tiles you have as N1 and calculate B1 = Floor(log2(N1)) as the number of bits you need to store your tiles. There's no need to use this number just yet, just store your tiles as the regular numbers in a data structure of your choice, like matrices. Be warned though that this only works if your tile id's start at 1 and increment by 1 - so for example if you have N=6 then the ID's for the tiles would be in the inclusive [1...6] range, leaving the 0 spot as a special 'this is not a tile' case that'll allow you to stack maps of different sizes using this method.

The trick kicks in when you want to store additional maps. Calculate the same B-value for your second map, so B2 = Floor(log2(N2)). Now calculate the sum of all previous B-values, Bsum,1 = B1 in this case since there is only one previous B-value. Make sure not to count the B-value of the current map! Now, multiply all of your tile ID's (again, the same range should be used) by 2Bsum,1 and add each of these adjusted tile ID's to the ones of the first map. Of course, you can keep repeating this for more maps, for example with Bsum,2 = B1 + B2 for the large map. Now you can retrieve any map from this compound map by dividing by 2Bsum,[map number - 1] (the same number the original ID's were multiplied by) and taking the Floor() of the result. So for the second map this would be Floor(ID / 2Bsum,1)

The way this works is by relying on binary numbers. If you don't know what these are I'm sure there are good primers on them somewhere on the internet so I'm not gonna bother explaining them right here. Basically what this math does is calculate what the least amount of bits is that can store all your tile ID's (B) and shift every consecutive map over by the sum of all B-values for the previous maps.
So let's take N1 = 6 and N2 = 21 as an example. B1 = 3 which is correct since the highest ID would be 6 which is 110 in binary, taking up the predicted 3 bits. B2 = 5, also correct since 21 is 10101 in binary. Now multiplying by 2 in binary has the interesting property of shifting all the bits one space to the left - 6*2=12 and indeed 12 in binary becomes 1100. So when multiplying the tile ID's for the second map by 2Bsum,1 = 2B1 = 23 all the bits will shift three spaces to the left, ensuring that there are always three unused bits all the way on the right - which are precisely filled by the bits that form the tile ID's of the first map! No matter which tile ID's are transformed this way those bits will never be overwritten, thus preserving both maps. That this works for multiple maps is easy to see, as the next map will have its ID's shifted Bsum,2 = B1 + B2 = 3 + 5 = 8 places to the left, leaving room for both previous maps.

This is much more efficient than the prime number conversion, and simpler to boot since you don't have to make a prime number calculator and can simply use built-in logarithms for the math instead. As a comparison, the prime-number method could only store 10 maps with 2 features this way, while this method can store 46 bits in total with 14 decimals of precision in the floating point numbers, which means a whopping 23 maps with 2 features as opposed to the 10 from the old method (Or 46 if you use 0 as traversible space instead of a special 'this is outside of the map' value)

This method of compression can be used for other purposes as well - you could even use it on the table of N-values for each map to combine all of those! Simply treat each N-value as a map with only 1 tile ID and use the value itself for the N-value of this imaginary map and you're set! You could store all N-values for all maps that fit in one matrix/other data structure in a single number this way if you wanted to. And for the true geeks who want to compress even further you could even use other compression techniques on this already compressed map to compress it even further, though the gains will be lessened as is usually the case when you're stacking compression methods.

Well, there's my first post to omnimaga in I think over three years. If this is too necro-y feel free to just move this to another thread altogether, but I felt that it was appropriate to post this improved version of a trick I came up with three years ago (but which has been widely known in the much improved version in this post to most computer scientists for decades) in the same thread I originally posted it. Hope this helped someone!

Geekboy1011:
That's quite a nice writeup! I'm sure someone will make use of it seeing as this whole topic is interesting! Also definitely the right place for it. Nice to have you around again!

Navigation

[0] Message Index

[*] Previous page

Go to full version