Matt Gerrans
Posts: 1153
Nickname: matt
Registered: Feb, 2002
Re: Word Frequency Data
Posted: Mar 6, 2003 2:58 PM
For this task, I'd use a StreamTokenizer and a Map with a HashMap implementation. The WFreqRecord class seems unnecessary to me.
Each key in the Map would be a String that contains a word. Each value would hold the count (or an object with more detailed information, like where the word was found -- this could be used to build an indexing engine that you could sell to Google for several million dollars). You didn't say anything about case-sensitivity (do "Word" and "word" count as two different words?), but if you want case-insensitivity, you could use the lower-cased word as the key. If the entry doesn't exist, add it with a value of 1; otherwise, increment its value (search this forum for "LetterFrequencyThingy" to find a very similar solution to a homework assignment that did much the same thing with individual letters).
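Here is a rough sketch of the counting part, assuming you want case-insensitive counts and already have the words in a List. The class name WordCounts and the count() method are just names I made up for illustration; they are not from your code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class WordCounts {
    // Hypothetical helper: tallies how often each word occurs.
    // Keys are lower-cased, so "Word" and "word" share one entry.
    static Map<String, Integer> count(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            String key = word.toLowerCase(Locale.ROOT);
            Integer current = counts.get(key);
            // No entry yet: start at 1. Otherwise increment the stored count.
            counts.put(key, current == null ? 1 : current + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(Arrays.asList("Word", "word", "other"));
        System.out.println("word: " + counts.get("word"));
        System.out.println("other: " + counts.get("other"));
    }
}
```

If you later want more than a count per word (positions, line numbers), you could swap the Integer value for a small class of your own without changing the overall shape.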
You need to decide how words are delimited (by spaces only, or punctuation, etc.) and set up your StreamTokenizer accordingly.
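As one possible setup, the sketch below assumes a word is a run of ASCII letters and apostrophes, and that everything else (spaces, punctuation, digits) is a delimiter. That rule is my assumption, not a requirement from your assignment, and WordTokenizer and words() are made-up names.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class WordTokenizer {
    // Hypothetical helper: returns the lower-cased words read from the Reader.
    static List<String> words(Reader reader) throws IOException {
        StreamTokenizer tokens = new StreamTokenizer(reader);
        tokens.resetSyntax();             // start with every character "ordinary"
        tokens.wordChars('a', 'z');       // letters form words
        tokens.wordChars('A', 'Z');
        tokens.wordChars('\'', '\'');     // keep apostrophes inside words ("don't")
        tokens.lowerCaseMode(true);       // word tokens come back lower-cased

        List<String> result = new ArrayList<>();
        while (tokens.nextToken() != StreamTokenizer.TT_EOF) {
            // Ordinary characters come back as single-character tokens; skip them.
            if (tokens.ttype == StreamTokenizer.TT_WORD) {
                result.add(tokens.sval);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        try {
            System.out.println(words(new StringReader("Don't stop, Word-word 42!")));
        } catch (IOException e) {
            System.err.println("Could not read input: " + e.getMessage());
        }
    }
}
```

With this configuration a hyphenated term such as "Word-word" is split into two words; if you want it kept whole, add '-' to the word characters.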
By the way, it isn't good practice to throw exceptions from main(). It is better to catch the exception(s), determine the cause, and then either deal with it or show a useful message -- that is the point of checked exceptions.
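To illustrate the idea, here is a minimal sketch of a main() that catches its exceptions instead of declaring them. It assumes the program takes one file name on the command line; SafeMain, run(), and the exit codes are my own choices for the example, and the line counting just stands in for your real processing.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class SafeMain {
    // Hypothetical helper: does the work and returns an exit code
    // (0 = success, 1 = I/O problem, 2 = bad usage).
    static int run(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java SafeMain <text-file>");
            return 2;
        }
        try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
            int lines = 0;
            while (in.readLine() != null) {
                lines++;
            }
            System.out.println("Read " + lines + " lines from " + args[0]);
            return 0;
        } catch (FileNotFoundException e) {
            // The most likely cause: a mistyped or missing file name.
            System.err.println("Cannot open '" + args[0] + "': " + e.getMessage());
            return 1;
        } catch (IOException e) {
            // Anything else that goes wrong while reading.
            System.err.println("Error while reading '" + args[0] + "': " + e.getMessage());
            return 1;
        }
    }

    public static void main(String[] args) {
        System.exit(run(args));
    }
}
```

The user gets a message that says what went wrong and with which file, rather than a stack trace.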