New Application for DAWG
I have developed an anagram finder program called Anagram Free Finder that uses this code. It contains about 400 lines of code and has a useful GUI. It will find all anagrams of equal or lesser length to the word you type in. It displays results instantly when you type a key.
Please visit it at
http://code.msdn.microsoft.com/anagrams. It is the best way to actually see a demonstration of this 1985 technology.
Big Update May 2008!
The DawgKit program was cleaned up substantially and is in better shape. Not a lot of real algorithm changes. The code is better
and more factored. It uses more interfaces and some of the supporting code is much better.
Introduction
I have written code in C# that implements the
Directed Acyclic Word Graph algorithm and data stucture. This sort of data structure is used to design word games such as Scrabble or anagrams and also has usage in
scientific pursuits in biology and genetics.
This code page is meant to serve as an outpost for this code. I have a complete implementation in C#. The DAWG algorithms have a dramatic
performance advantage over hash tables, sometimes occupying 25% of the memory and also having faster performance. I think these source files can find utility in a broader range of projects, such as a spell-checker, or other string analysis or pattern matching tools.
Important Resources
Implementation details can be found at
http://dotnetperls.com/Content/Directed-Acyclic-Word-Graph.aspx.
I will be optimizing/enhancing this software in a series. Read the series by looking at the first link to Directed-Acyclic-Word-Graph.aspx. There is a series you can continue reading from there. Finally there is a page with benchmarks and a conclusion. Here are a couple of the results graphs. The DAWG and the Dictionary are the two on the left. The Bloom filter is another structure but it doesn't store all the same data, which is why it is smaller.
My series also includes a list of things I have learned about writing fast C# code. I hope to add more to this page and organize this stuff better. Other methods I used in implementing DawgKit include Enumerable.Range in LINQ
http://dotnetperls.com/Content/LINQ-Enumerable-Range.aspx and also a TextBox trick to improve the UI.
http://dotnetperls.com/Content/TextBox-AppendText-Tip.aspx It is my goal to keep improving this software occasionally, as well as the articles I have written about it. It is fairly complex but I believe it can greatly benefit some software out there.