Omnimaga

General Discussion => Other Discussions => Math and Science => Topic started by: Munchor on August 02, 2011, 09:11:10 am

Title: Loop all possible words algorithm
Post by: Munchor on August 02, 2011, 09:11:10 am
I'm wondering about how to loop all possible words like:

Code: [Select]
a
b
c
...
aa
ab
ac
ad
...
bce
...

I know this is a CPU Consuming algorithm, but I'm wondering of how to replicate it in C. I'm not asking for C code or C help, but simply the algorithm in general words. Thanks, also if this algorithm has a name, please tell me ;)
Title: Re: Loop all possible words algorithm
Post by: AngelFish on August 02, 2011, 10:28:02 am
Well, you could allocate an array, then do an infinite loop consisting of effectively

Code: [Select]

int N = max_word_length;
int *A = malloc (sizeof (int) * N);
char c;
int i;

while (1) {
A[i] = (A[i]+1)%(26+61);

}

Basically, you can increment the ascii values (And wrap them around with the modulo operator). If the ascii value in any particular byte overflows and wraps around to 0, then increment the next higher byte in the array. It's essentially treating the words as big numbers and adding them.
Title: Re: Loop all possible words algorithm
Post by: Happybobjr on August 02, 2011, 11:02:49 am
sorry i don't know much abut C nor it's syntax but in a psudo code you could do something like this.

set String0 = ""
set string1 = "a"
set string2 = "b"
....

for (%var, 0, 26)
for (%var2, 1, 26)
set strings = "string"+"%var%"+"%var2%"
if strings = yourword
do bla bla bla.
end
end.

sorry for having all the different languages in there :P it is a mix of ti-basic and windows batch thingy.
Title: Re: Loop all possible words algorithm
Post by: harold on August 02, 2011, 11:36:08 am
Qwerty, you put the parentheses wrong.
Also if you do it exactly like that, you can't get the "carry"
The basic technique is valid though of course
Title: Re: Loop all possible words algorithm
Post by: AngelFish on August 02, 2011, 11:46:51 am
Well, it was pseudo-code more than anything else.
Title: Re: Loop all possible words algorithm
Post by: calc84maniac on August 02, 2011, 11:48:10 am
Well, it was pseudo-code more than anything else.
I have no idea what's with the "char c;" though :P
Title: Re: Loop all possible words algorithm
Post by: Munchor on August 02, 2011, 11:54:10 am
Code: [Select]
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
  int N = 200;
  int *A = malloc (sizeof (int) * N);
  char c;
  int i;

  while (1)
  {
    A[i] = (A[i]+1)%(26+61);
  }

  return 0;
}

This gives segmentation fault, GDB is not very helpful.


Following happybojr's idea, I made this in C++:.

Code: [Select]
#include <iostream>
#include <string>

using namespace std;

int main(int argc, char *argv[]);

int main(int argc, char *argv[])
{
  string all_chars = "abcdefghijklmnopqrstuvwxyz";
 
  int i, u;
  string finalString;
  for (i=0; i<26; i++)
  {
    for (u=0; u<26; u++)
    {
      finalString.append( all_chars.substr(i,1) );
      finalString.append( all_chars.substr(u,1) );
      cout << finalString << endl;
    }
  }
 
  return 0;
}

It works more or less, here's the beginning of the output:

Quote
aa
aaab
aaabac
aaabacad
aaabacadae
aaabacadaeaf
aaabacadaeafag
aaabacadaeafagah
aaabacadaeafagahai
aaabacadaeafagahaiaj
aaabacadaeafagahaiajak
aaabacadaeafagahaiajakal
aaabacadaeafagahaiajakalam
aaabacadaeafagahaiajakalaman
aaabacadaeafagahaiajakalamanao
aaabacadaeafagahaiajakalamanaoap
aaabacadaeafagahaiajakalamanaoapaq
aaabacadaeafagahaiajakalamanaoapaqar
aaabacadaeafagahaiajakalamanaoapaqaras
aaabacadaeafagahaiajakalamanaoapaqarasat
aaabacadaeafagahaiajakalamanaoapaqarasatau
Title: Re: Loop all possible words algorithm
Post by: AngelFish on August 02, 2011, 11:56:27 am
The string length shouldn't be increasing that quickly. You should be getting a, b, c, d, ..., x, y, z, aa, ab, ac, ad, ..., zx, zy, zz, aab, etc...
Title: Re: Loop all possible words algorithm
Post by: Munchor on August 02, 2011, 11:57:36 am
The string length shouldn't be increasing that quickly. You should be getting a, b, c, d, ..., x, y, z, aa, ab, ac, ad, ..., zx, zy, zz, aab, etc...

I know, hence I'm asking for help, I don't quite see how to make it, I don't even need code, just thoughts :)
Title: Re: Loop all possible words algorithm
Post by: AngelFish on August 02, 2011, 12:03:08 pm
well, the general algorithm I suggested will get you that. I'm not surprised that the "code" I gave you segfaults because I didn't mean it as actual code :P

My idea was that you take an array of ascii characters and for each iteration, you increment the value of the lowest character in the array. If that character value overflows (goes above the ascii value of "z"), you change the value of that character back to the value of "a" and add one to the next character in the array (basically a carry bit like in binary addition). If that value overflows too, repeat the process until the characters stop overflowing.
Title: Re: Loop all possible words algorithm
Post by: jnesselr on August 02, 2011, 12:48:40 pm
Instead of an array, try something else.  If you are kinda good with number systems, think of it this way.  You have a base 26 number.  If you have a function that converts that number into the correct string, then you just have to constantly increase the number by 1 each time.
Title: Re: Loop all possible words algorithm
Post by: calc84maniac on August 02, 2011, 12:51:05 pm
Instead of an array, try something else.  If you are kinda good with number systems, think of it this way.  You have a base 26 number.  If you have a function that converts that number into the correct string, then you just have to constantly increase the number by 1 each time.

Actually, that method doesn't take length of strings into account. 0 would represent ...AAAAAAAA, 1 would represent ...AAAAAAB, etc.
Title: Re: Loop all possible words algorithm
Post by: Munchor on August 02, 2011, 12:52:55 pm
Code: [Select]
#include <iostream>
#include <algorithm>
#include <vector>
#include <iterator>
#include <fstream>
 
using namespace std;
 
int main()
{
  string all_chars_string = "abcdefghijklmnopqrstuvwxyz";
  vector<string> all_chars;

  int i;
  for (i=0; i<all_chars_string.length(); i++)
  {
    all_chars.push_back( all_chars_string.substr(i, 1) );
  }

  /* Write words to the file */
  ofstream myfile;
  myfile.open ("words.txt");
  for (i=0; i<all_chars_string.length(); i++)
  {
    while ( next_permutation(all_chars.begin(), ( all_chars.end() - (all_chars_string.length()-i) ) ) )
    {
      copy(all_chars.begin(), ( all_chars.end() -(all_chars_string.length()-i) ) , ostream_iterator<string>(myfile,""));
      myfile << "\n";
      cout << endl;
    }
  }
  myfile.close();
  return 0;
}

This is not quite it, but it's pretty good, in a few seconds it generated a 500000 lines of words, what it does is:

All possible words with A
All possible words with A, B
All possible words with A, B, C
All possible words with A, B, C, D

Do you have any idea of how to change it to make all possible words*?
Title: Re: Loop all possible words algorithm
Post by: ztrumpet on August 02, 2011, 01:06:38 pm
How about some TI Basic code: ;D
Code: [Select]
:ClrHome
:For(A,0,7
:Disp " //No spaces, just a quote
:End
:1->B
:1->C
:Delvar L1 16->dim(L1 //Omit the space between 'L1' and the '1' of "16" when typing on actual calc
:Repeat getKey or B=17
:If C
:1+L1(B->L1(B
:For(A,1,B
:Output(8,A,sub("ABCDEFGHIJKLMNOPQRSTUVWXYZ",L1(A),1
:End
:If 26=L1(B
:Then
:While 26=L1(max(1,A-1
:A-1->A
:1->L1(A
:End
:If A=1
:Then
:B+1->B
:Else
:DelVar C1+L1(A-1->L1(A-1
:End
:End
:End

Finally!  That only took 30 minutes to debug. O.O
Title: Re: Loop all possible words algorithm
Post by: calc84maniac on August 02, 2011, 01:09:58 pm
Actually, I guess graphmastur's method works if you keep track of the word length as well as the increasing value. This code works for strings up to 13 characters (since 26^14 is greater than the 64-bit integer maximum value):
Code: [Select]
#include <stdio.h>

void displayWord(unsigned long long word, int n) {
char* buffer = malloc(n+1);
buffer[n] = '\0';
while (n)
{
buffer[--n] = word%26 + 'A';
word /= 26;
}
printf("%s\n", buffer);
free(buffer);
}

int main()
{
int length = 1;
unsigned long long word = 0;
unsigned long long limit = 26;
while (1) {
displayWord(word++, length);
if (word == limit)
{
word = 0;
length++;
limit *= 26;
}
}
}
Title: Re: Loop all possible words algorithm
Post by: harold on August 02, 2011, 01:20:41 pm
This probably won't help because it's an inefficient algorithm, but here's some Haskell that does what you want:
Code: [Select]
allLetters = ['a'..'z']
allWords = concat [wordsOfLen x | x<-[1..]]
           where
                wordsOfLen 1 = [[x] | x<-allLetters]
                wordsOfLen len = [x : y | x<-allLetters, y<-wordsOfLen (len - 1)]
Title: Re: Loop all possible words algorithm
Post by: Ashbad on August 02, 2011, 02:20:36 pm
Well, in a C/C++ way of thinking, a way would possibly be something like this:

Code: [Select]
void*GetPossibleWords(unsigned int amount) {
  unsigned int length=0;
  unsigned int char_length=1;
  unsigned int placeholder=amount;
  int placeholder2=1;
  while (placeholder) {
    length+=placeholder^26*placeholder2++;
    placeholder=(placeholder>=0)?(placeholder/26):(0);
  }
  void*memalloc=malloc(length);
  void*returnalloc=memalloc;
  *(memalloc)=length;
  memalloc+=sizeof(unsigned int);
  char*chars=malloc(placeholder2);
  *(chars)='a';
  int k;
  for(int j=1;j<=char_length;j++) {
    memcop(chars,memalloc,char_length);
    k=char_length;
    while(k) {
      *(chars+k-1)++
      if (*(chars+k-1)=='z') {
        *(chars+(k--)-1)='a';
      } elseif (*(chars)=='z') {
        char_length++;
        for(int l=0;l<=char_length-1;l++) {
          *(chars+l)='a'; }
      } else { k=0; }
    } }
  return returnalloc;
}

This is really rough, since I kinda just drafted it out in <5 minutes with no testing whatsoever :P.  In fact, i think its only like 80% complete, since I really was rushed making it.  What the purpose of this is to take an amount of words to process and output a pointer (with the sizeof(int) first bytes holding the size of the returned allocated memory) pointing to a chunk in memory with the words in lexical order.

Edit: haha, just realized this is really broken.  Ignore that it wont loop through more than once, just keep in mind the purpose I was aiming towards, since this was kinda a step in the right direction..
/me goes back to making a much smaller and working version of this in Haskell..

Edit2: was I seriously ninja'd by Haskell code? <.<
Title: Re: Loop all possible words algorithm
Post by: DJ Omnimaga on August 02, 2011, 02:28:54 pm
Lol actually I found that quite funny because you mentionned you like Haskell a lot in another topic, yet you get ninja'd by Haskell code while posting using a different programming language. ;D
Title: Re: Loop all possible words algorithm
Post by: nemo on August 02, 2011, 03:06:45 pm
here's my attempt, though my use of pointers is probably pretty ugly since i just used an online tutorial as a reference for that part.

Code: [Select]
#include <stdio.h>

unsigned long pow(unsigned long base, unsigned long exp)
{
    if(exp == 1)
        return base;
    return base * pow(base, exp - 1);
}

char *GetString(int stringLength, unsigned long combination)
{
    char* character = malloc(sizeof(char) * 2);
    *character = combination % 26 + 'a';
    *(character + 1) = '\0';
    if(stringLength == 1)
        return character;
    return strcat(GetString(stringLength - 1, combination / 26), character);
}

int main(int argc, int *argv[])
{
    const int MAX_LEN = 3;
    int len;
    unsigned long i;
    for(len = 1; len <= MAX_LEN; len++)
    {
        for(i = 0; i < pow(26, len); i++)
        {
            printf("%s\n", GetString(len, i));
        }
    }
    return 0;
}
Title: Re: Loop all possible words algorithm
Post by: phenomist on August 05, 2011, 03:20:25 am
Attempt to create a short solution that fails epically (in TI-BASIC)

Darn my TI-BASIC coding skills are rather rusty. Didn't test oncalc, sorry.

Code: [Select]
:ClrHome
:For(A,1,8
:Disp "
:End
:For(A,1,1E99
:" ->Str1
:A->B
:While B
:round(26fPart(B/26->C
:If C
:Then
:sub("ABCDEFGHIJKLMNOPQRSTUVWXY",C,1)+Str1->Str1
:Else
:"Z"+Str1->Str1
:B-26->B
:End
:iPart(B/26->B
:End
:Output(8,1,Str1
:Disp "
:End

Slightly shorter than ztrumpet's? :P

idk, but I think it's quite a bit faster
Title: Re: Loop all possible words algorithm
Post by: Horrowind on August 05, 2011, 07:04:11 am
well.... my two cents using recursion:
Code: [Select]
#include "iostream"
#include "string"
void allwords(string s, int i) {
for(char j = 97; j <123; j++) {
if(i > 1)
allwords(s + j, i-1);
else
std::cout<<s+j<<'\n';
}
}
int main(void) {
int i = 0;
while(1) {
allwords("", i);
i++;
}
}
Title: Re: Loop all possible words algorithm
Post by: Jim Bauwens on August 05, 2011, 10:38:50 am
Here is a verson in Lua ;D
Code: (Lua) [Select]
a="a"

function increment(str)
        len = #str
        if len==0 then
                return "a"
        end
        bt = str:byte(len)
        if bt<122 then
                str = str:sub(1, len - 1) .. string.char(bt+1)
        else
                str = increment(str:sub(1, len - 1)) .. "a"
        end
        return str
end

print(a)
for i=1,100 do
 a=increment(a)
 print(a)
end
Title: Re: Loop all possible words algorithm
Post by: Munchor on August 05, 2011, 10:52:30 am
Code: [Select]
def increment(x):
    """Increments the string x"""
    length_of_string = len(x)
    if length_of_string == 0:
        return "a"
    bt = ord(x)
    if bt < 122:
        x = x[1:-1] + chr(bt+1)
    else:
        x = increment(x[1:-1] + "a")
    return x
   
def main():
    """Displays all possible words using alphabet a-z"""
    a = "a"
   
    for i in range(1000):
        a = increment(a)
        print a
 
if __name__ == "__main__":
    main()

Heh Jim, I tried to copy cat your version to Python, but failed :P
Title: Re: Loop all possible words algorithm
Post by: Levak on August 05, 2011, 11:05:35 am
Algorithm
Code: [Select]
import Levak

def main():
    for words in Levak.dictionnary:
        print words

Output
Code: [Select]
42


Hum, I'm missing something ...
Title: Re: Loop all possible words algorithm
Post by: Jim Bauwens on August 05, 2011, 11:18:51 am
ephan, I'll give you a tip: Lua starts with 1, python with 0 :)
(/me waits for the beating)

@Levak, lol
Title: Re: Loop all possible words algorithm
Post by: Munchor on August 05, 2011, 11:25:20 am
ephan, I'll give you a tip: Lua starts with 1, python with 0 :)
(/me waits for the beating)

@Levak, lol

Jim, I did not understand your algorithm, I just copied so I don't know where it is wrong.

Code: [Select]
def increment(x):
    """Increments the string x"""
    length_of_string = len(x)
    if length_of_string == 0:
        return "a"
    bt = ord(x) #The problem is that ord doesn't work for strings, only characters
    if bt < 122:
        x = x[-1] + chr(bt+1)
    else:
        x = increment(x[-1] + "a")
    return x
   
def main():
    """Displays all possible words using alphabet a-z"""
    a = "a"
   
    for i in range(1000):
        a = increment(a)
        print a
 
if __name__ == "__main__":
    main()

So I tried:

Code: [Select]
def increment(x):
    """Increments the string x"""
    length_of_string = len(x)
    if length_of_string == 0:
        return "a"
    bt = x.encode("hex")
    if bt < 122:
        x = x + chr(bt+1)
    else:
        x = increment(x + "a")
    return x
   
def main():
    """Displays all possible words using alphabet a-z"""
    a = "a"
   
    for i in range(1000):
        a = increment(a)
        print a
 
if __name__ == "__main__":
    main()

But I get recursion errors.
Title: Re: Loop all possible words algorithm
Post by: ztrumpet on August 05, 2011, 12:25:32 pm
Slightly shorter than ztrumpet's? :P

idk, but I think it's quite a bit faster
I'll take that challenge.
I'm going to try to beat you on both speed and size.  Here's to coding! ^-^

Edit: Yours doesn't have any errors; well done.  I really like the algorithm you used.  Also, you probably want to flip the final Disp and Output lines so you can see a little bit more on the screen at once.  Excellent job, though. :D\

Edit 2: I just settled for optimizing yours. ;)  Here's what I came up with:
Code: [Select]
:ClrHome
:For(A,-8,-1
:Disp "
:End
:Repeat getKey
:" ->Str1
:A+1->A
:Ans->B
:While Ans
:round(26fPart(Ans/26
:If Ans
:Then
:sub("ABCDEFGHIJKLMNOPQRSTUVWXY",Ans,1)+Str1->Str1
:B
:Else
:"Z"+Str1->Str1
:B-26
:End
:int(Ans/26->B
:End
:Disp "
:Output(8,1,Str1
:End
Title: Re: Loop all possible words algorithm
Post by: Munchor on August 05, 2011, 02:08:54 pm
Size comparisons ztrumpet? Nice either way
Title: Re: Loop all possible words algorithm
Post by: ztrumpet on August 05, 2011, 02:24:18 pm
Size comparisons ztrumpet? Nice either way
Technically they're the same size, but my version's faster. ;)
Title: Re: Loop all possible words algorithm
Post by: fb39ca4 on August 05, 2011, 02:33:12 pm
so this is basically counting upwards in base 26, using the letters as digits, right?
Title: Re: Loop all possible words algorithm
Post by: Jim Bauwens on August 05, 2011, 03:46:31 pm
ephan, here is a working python version (fixed your code):
Code: (Python) [Select]

def increment(x):
    """Increments the string x"""
    length_of_string = len(x)
    if length_of_string == 0:
        return "a"
    bt = ord(x[-1:])
    if bt < 122:
        x = x[:-1] + chr(bt+1)
    else:
        x = increment(x[:-1]) + "a"
    return x
   
def main():
    """Displays all possible words using alphabet a-z"""
    a = "a"
   
    for i in range(1000):
        a = increment(a)
        print a
 
if __name__ == "__main__":
    main()
Title: Re: Loop all possible words algorithm
Post by: phenomist on August 05, 2011, 05:12:17 pm
Also, pseudocode example for really fast generation:

Code: [Select]
initarray := [a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z]
mod := initarray
while 1:{
for loopvar from 1 to dim(mod):{
output mod[loopvar]}
newmod := []
for loopvar from 1 to 26:{
temp := initarray[loopvar] + mod
newmod := newmod + temp
}
mod := newmod
}

ok this really kills the memory, whoops
and you'll have really long pauses for when the computer is calculating the next string length

EDIT: figured out a constant-time algorithm, same memory problems though
pseudocoded for viewing pleasure:
Code: [Select]
list := [a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z]
queue := list
while forever{
pop queue => string
print string
concatenate string, list => list2
push list2 into queue
}