Piglet
02-14-2009, 05:25 PM
I have spent some time wondering about how to compress a dictionary for scrabble or such. The code doesn't need a definition of the word, it just needs to know if the word exists. I've tried several methods but the 'bit-pair' route seems the best.
Basically, start with 26 pointers to tables. Each table entry is 2 bits thus:
00 = does not exist
01 = exists but cannot end a word
10 = exists and may end a word
11 = exists and may or may not end a word.
The table for the third letter is made from the non 00 entries of the last table and is situated after the second letter table. You can easily calculate the size.
I noted that after the third letter, the tables get very small very fast so often it's just 1 or 2 entries.
I have never seen this method used, but then I lead a sheltered life. Anyone know this route already or a better one?
Basically, start with 26 pointers to tables. Each table entry is 2 bits thus:
00 = does not exist
01 = exists but cannot end a word
10 = exists and may end a word
11 = exists and may or may not end a word.
The table for the third letter is made from the non 00 entries of the last table and is situated after the second letter table. You can easily calculate the size.
I noted that after the third letter, the tables get very small very fast so often it's just 1 or 2 entries.
I have never seen this method used, but then I lead a sheltered life. Anyone know this route already or a better one?