Project: Frequency Table (Words)
Write a program that generates a table of word frequencies. The program should read a stream from stdin and count the occurrences of each word. Once the whole stream has been read, the program should print a tab-delimited table with columns for the Word, Count, and Frequency (abbreviated Freq in the header row). The frequency of each word is its count divided by the total number of words in the input. In the sample output below, words are lowercased and rows are sorted by word.
You can get books from the [[Project Gutenberg]] site to use in testing your program.
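The steps described above can be sketched in Python as follows. This is one possible approach, not the reference implementation behind the sample output. The function names `word_frequencies`, `format_table`, and `main` are made up for illustration. The definition of a word as a run of Unicode letters, digits, or underscores (the regex `\w+`) is an assumption; the original script may split text differently, for example around apostrophes or hyphens.

```python
import re
import sys
from collections import Counter

# Assumption: a "word" is a maximal run of Unicode word characters
# (letters, digits, underscore). The project text does not define this.
WORD_RE = re.compile(r"\w+")


def word_frequencies(text):
    """Return (word, count, frequency) rows sorted by word."""
    # Lowercase first so "The" and "the" are counted as the same word.
    words = WORD_RE.findall(text.lower())
    counts = Counter(words)
    total = len(words)
    # Frequency is the word's count divided by the total number of words.
    # For empty input the comprehension is empty, so no division happens.
    return [(word, n, n / total) for word, n in sorted(counts.items())]


def format_table(rows):
    """Format the rows as a tab-delimited table with a header line."""
    lines = ["Word\tCount\tFreq"]
    lines.extend(f"{word}\t{n}\t{freq:.6f}" for word, n, freq in rows)
    return "\n".join(lines)


def main(stream=sys.stdin):
    # Read the whole stream before printing, as the project requires.
    print(format_table(word_frequencies(stream.read())))


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as `frequency_table_words.py`, the sketch would be run as `python3 frequency_table_words.py < GettysburgAddress.txt`. Sorting uses plain string comparison, which is why `1863` sorts before `19` in the sample output. Python decodes stdin using the locale's encoding, so the Unicode test file is expected to need a UTF-8 locale.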
Output
$ perl FrequencyTableWords.pl < ../../data/text/GettysburgAddress.txt
Word Count Freq
1863 1 0.003484
19 1 0.003484
a 7 0.024390
above 1 0.003484
add 1 0.003484
address 1 0.003484
advanced 1 0.003484
ago 1 0.003484
all 1 0.003484
...
war 2 0.006969
we 10 0.034843
what 2 0.006969
whether 1 0.003484
which 2 0.006969
who 3 0.010453
will 1 0.003484
work 1 0.003484
world 1 0.003484
years 1 0.003484
$ perl FrequencyTableWords.pl < ../../data/text/USConstitution.txt
Word Count Freq
1 21 0.002750
10 1 0.000131
10th 1 0.000131
11th 1 0.000131
12th 1 0.000131
13th 1 0.000131
14th 1 0.000131
15th 2 0.000262
16th 1 0.000131
...
would 2 0.000262
writ 1 0.000131
writing 1 0.000131
writings 1 0.000131
writs 2 0.000262
written 5 0.000655
year 10 0.001310
years 23 0.003012
yeas 2 0.000262
york 2 0.000262
$ perl FrequencyTableWords.pl < ../../data/text/UnicodeTest.utf8
Word Count Freq
10 1 0.003802
11 1 0.003802
4 1 0.003802
4スコアと7年前 1 0.003802
7 2 0.007605
8 1 0.003802
9 1 0.003802
a 1 0.003802
ago 1 0.003802
...
우리 1 0.003802
인이 1 0.003802
잉태 1 0.003802
전 1 0.003802
점 1 0.003802
제안했습니다 1 0.003802
조상들은 1 0.003802
창조되었다고 1 0.003802
평등하게 1 0.003802
한 1 0.003802