Project: Frequency Table (Bytes)
Write a program that generates a byte frequency table. The program should read a stream from stdin as raw bytes and count the occurrences of each byte value (0 to 255). Once the stream has been read, the program should print a tab-delimited table with columns for Byte Value (decimal), Byte Value (hexadecimal), Count, and Frequency. The frequency of a byte is its count divided by the total number of bytes in the stream. Print rows only for byte values with a count greater than 0. As the sample output shows, rows are ordered by ascending byte value, the hexadecimal value is written as two lowercase digits, and the frequency is shown to four decimal places.
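One possible shape for such a program is sketched below in Perl, the language used for the sample runs. It is a minimal sketch rather than a reference solution: the subroutine names `count_bytes` and `format_table` and the 64 KiB read size are assumptions made for illustration, and the two-digit hex and four-decimal formatting are inferred from the sample output.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Count how many times each byte value (0..255) occurs on a filehandle.
# Returns a reference to the 256-element count array and the total byte count.
# (count_bytes is a hypothetical name, not part of the project statement.)
sub count_bytes {
    my ($fh) = @_;
    binmode($fh);                  # raw bytes: no newline or encoding translation
    my @count = (0) x 256;
    my $total = 0;
    my $buf;
    # Read in fixed-size chunks; read() returns 0 at end of stream.
    while (my $n = read($fh, $buf, 65536)) {
        $count[$_]++ for unpack('C*', $buf);   # 'C*' yields one unsigned byte per element
        $total += $n;
    }
    return (\@count, $total);
}

# Build the tab-delimited table: a header plus one row per byte value seen.
# (format_table is likewise a hypothetical name.)
sub format_table {
    my ($count, $total) = @_;
    my @lines = (join("\t", qw(Dec Hex Count Freq)));
    for my $byte (0 .. 255) {
        next unless $count->[$byte] > 0;       # skip values that never occurred
        push @lines, sprintf("%d\t%02x\t%d\t%.4f",
            $byte, $byte, $count->[$byte], $count->[$byte] / $total);
    }
    return join("\n", @lines) . "\n";
}

# When run as a script, read stdin and print the table.
print format_table(count_bytes(\*STDIN)) unless caller;
```

Because rows with a zero count are skipped before the division, an empty stream prints only the header and never divides by zero. Iterating over 0 to 255 in order also gives the ascending row order without a separate sort.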
Output
$ perl FrequencyTableBytes.pl < ../../data/text/USConstitution.txt
Dec Hex Count Freq
10 0a 978 0.0202
32 20 10282 0.2129
34 22 2 0.0000
40 28 5 0.0001
41 29 5 0.0001
44 2c 565 0.0117
45 2d 51 0.0011
46 2e 290 0.0060
48 30 4 0.0001
...
113 71 47 0.0010
114 72 2138 0.0443
115 73 2393 0.0495
116 74 3647 0.0755
117 75 747 0.0155
118 76 416 0.0086
119 77 347 0.0072
120 78 95 0.0020
121 79 492 0.0102
122 7a 31 0.0006
$ perl FrequencyTableBytes.pl < ../../data/text/UnicodeTest.utf8
Dec Hex Count Freq
10 0a 70 0.0276
32 20 243 0.0959
44 2c 11 0.0043
45 2d 2 0.0008
46 2e 7 0.0028
48 30 1 0.0004
49 31 3 0.0012
52 34 2 0.0008
55 37 3 0.0012
...
230 e6 5 0.0020
231 e7 13 0.0051
232 e8 8 0.0032
233 e9 4 0.0016
234 ea 4 0.0016
235 eb 18 0.0071
236 ec 23 0.0091
237 ed 6 0.0024
239 ef 3 0.0012
240 f0 43 0.0170
$ perl FrequencyTableBytes.pl < ../../data/binary/RandomBytes1M.bin
Dec Hex Count Freq
0 00 3869 0.0039
1 01 4021 0.0040
2 02 4006 0.0040
3 03 3886 0.0039
4 04 3930 0.0039
5 05 3886 0.0039
6 06 3865 0.0039
7 07 3965 0.0040
8 08 3870 0.0039
...
246 f6 3965 0.0040
247 f7 3833 0.0038
248 f8 3944 0.0039
249 f9 3846 0.0038
250 fa 3985 0.0040
251 fb 3930 0.0039
252 fc 4008 0.0040
253 fd 4032 0.0040
254 fe 3933 0.0039
255 ff 3815 0.0038