Pure Programmer

Project: Frequency Table (Words)

Write a program that generates a table of word frequencies. The program should read a stream from stdin and count the occurrences of each word it sees. Once the stream has been read, the program should print a tab-delimited table with columns for the Word, Count, and Frequency. The frequency of each word is the count for that word divided by the total number of words in the stream.
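One possible way to approach the counting and formatting steps is sketched below. This is not the reference solution, only a minimal sketch under some assumptions: the function names `countWords` and `formatTable` are hypothetical, a "word" is taken to be a run of Unicode letters or digits compared case-insensitively, rows are sorted by word, and the frequency is printed with six decimal places to match the sample output. The sketch works on a string so that it can be run without any input file.

```javascript
// Count occurrences of each word in a block of text.
// Assumption: a "word" is a run of Unicode letters or digits,
// and words are compared case-insensitively.
function countWords(text) {
  const counts = new Map();
  const words = text.toLowerCase().match(/[\p{L}\p{N}]+/gu) || [];
  for (const word of words) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return { counts, total: words.length };
}

// Build the tab-delimited table: a header row, then one row per word.
// Assumption: rows are sorted by word using the default string sort.
function formatTable({ counts, total }) {
  const lines = ["Word\tCount\tFreq"];
  for (const word of [...counts.keys()].sort()) {
    const count = counts.get(word);
    // Frequency = count for this word / total number of words.
    lines.push(`${word}\t${count}\t${(count / total).toFixed(6)}`);
  }
  return lines.join("\n");
}

// Small in-memory example instead of stdin.
const sample = "The cat and the hat. The end.";
console.log(formatTable(countWords(sample)));
```

For the sample sentence there are seven words in total, so the row for "the" shows a count of 3 and a frequency of 0.428571. To process stdin instead, the whole stream could be read into a string first (for example with `fs.readFileSync(0, "utf8")`) and passed to `countWords`.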

You can get books from the Project Gutenberg site to use in testing your program.

Output
$ node FrequencyTableWords.js < ../../data/text/GettysburgAddress.txt
Word	Count	Freq
1863	1	0.003484
19	1	0.003484
a	7	0.024390
above	1	0.003484
add	1	0.003484
address	1	0.003484
advanced	1	0.003484
ago	1	0.003484
all	1	0.003484
...
war	2	0.006969
we	10	0.034843
what	2	0.006969
whether	1	0.003484
which	2	0.006969
who	3	0.010453
will	1	0.003484
work	1	0.003484
world	1	0.003484
years	1	0.003484

$ node FrequencyTableWords.js < ../../data/text/USConstitution.txt
Word	Count	Freq
1	21	0.002750
10	1	0.000131
10th	1	0.000131
11th	1	0.000131
12th	1	0.000131
13th	1	0.000131
14th	1	0.000131
15th	2	0.000262
16th	1	0.000131
...
would	2	0.000262
writ	1	0.000131
writing	1	0.000131
writings	1	0.000131
writs	2	0.000262
written	5	0.000655
year	10	0.001310
years	23	0.003012
yeas	2	0.000262
york	2	0.000262

$ node FrequencyTableWords.js < ../../data/text/UnicodeTest.utf8
Word	Count	Freq
10	1	0.006329
11	1	0.006329
4	2	0.012658
7	3	0.018987
8	1	0.006329
9	1	0.006329
a	3	0.018987
ago	1	0.006329
all	1	0.006329
...
vier	1	0.006329
vor	1	0.006329
vorschlag	1	0.006329
vs	1	0.006329
vulcan	1	0.006329
with	3	0.018987
woozy	1	0.006329
wurde	1	0.006329
years	1	0.006329
zany	1	0.006329

Solution