Pure Programmer

Project: Frequency Table (Words)

Write a program that generates a table of word frequencies. The program should read a stream from stdin and count the occurrences of each word. Once the entire stream has been read, the program should print a tab-delimited table with the columns Word, Count, and Frequency. The frequency of each word is its count divided by the total number of words in the input.
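As a rough illustration of the requirements above, here is a minimal sketch in Python. The function names `count_words` and `format_table` are hypothetical, and several details are assumptions the project statement does not specify: words are lowercased before counting, a "word" is any run of characters matched by the regex class `\w`, rows are sorted alphabetically, and frequencies are printed with six decimal places.

```python
import re
import sys
from collections import Counter

# Assumption: words are separated by runs of non-word characters.
WORD_SEPARATOR = re.compile(r"\W+")


def count_words(lines):
    """Return a Counter mapping each lowercased word to its number of occurrences."""
    counts = Counter()
    for line in lines:
        # Pattern.split can yield empty strings at the edges of a line; skip them.
        counts.update(w.lower() for w in WORD_SEPARATOR.split(line) if w)
    return counts


def format_table(counts):
    """Return the tab-delimited table as a list of rows, header first."""
    total = sum(counts.values())
    rows = ["Word\tCount\tFrequency"]
    # Assumption: alphabetical order; the project does not specify a sort order.
    for word, count in sorted(counts.items()):
        rows.append(f"{word}\t{count}\t{count / total:.6f}")
    return rows


if __name__ == "__main__":
    # Read the whole stream from stdin, then print the table.
    for row in format_table(count_words(sys.stdin)):
        print(row)
```

Saved under a file name of your choice, the sketch would be run the same way as the sessions in the Output section, with the text file redirected to stdin. An empty input produces only the header row, since no division takes place when there are no words.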

You can download books from the Project Gutenberg site to use in testing your program.

Output
$ python FrequencyTableWords.py < ../../data/text/GettysburgAddress.txt
Traceback (most recent call last):
  File "/Users/rich/Desktop/Resources/pureprogrammer/py/projects/FrequencyTableWords.py", line 18, in <module>
    words = line.split(re.compile(r'\W+'))
TypeError: must be str or None, not re.Pattern

$ python FrequencyTableWords.py < ../../data/text/USConstitution.txt
Traceback (most recent call last):
  File "/Users/rich/Desktop/Resources/pureprogrammer/py/projects/FrequencyTableWords.py", line 18, in <module>
    words = line.split(re.compile(r'\W+'))
TypeError: must be str or None, not re.Pattern

$ python FrequencyTableWords.py < ../../data/text/UnicodeTest.utf8
Traceback (most recent call last):
  File "/Users/rich/Desktop/Resources/pureprogrammer/py/projects/FrequencyTableWords.py", line 18, in <module>
    words = line.split(re.compile(r'\W+'))
TypeError: must be str or None, not re.Pattern
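All three runs fail on the same statement: `str.split` accepts only a string (or `None`) as its separator, so passing it a compiled pattern raises `TypeError`. The snippet below reproduces the error and shows one likely fix, assuming the intent was to split each line on runs of non-word characters. The sample `line` value is made up for illustration.

```python
import re

line = "Four score and seven years ago"  # hypothetical sample input

# Reproduce the failure: str.split rejects a compiled pattern as separator.
try:
    line.split(re.compile(r"\W+"))
except TypeError as err:
    print(f"TypeError: {err}")

# Likely fix: call split on the compiled pattern (or use re.split) instead.
words = re.compile(r"\W+").split(line)
print(words)  # ['Four', 'score', 'and', 'seven', 'years', 'ago']
```

Note that splitting on a pattern can return empty strings when a line begins or ends with punctuation or whitespace, so a real program would probably filter those out before counting.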

Solution