Pure Programmer

Project: ChaCha20 Cipher

[[Salsa20|ChaCha20]] is a particularly efficient stream cipher for encrypting data. It is a form of symmetric encryption, in that the same key (derived here from a password) is used for both encryption and decryption.

Write a program that reads in a file and writes out an encrypted form of the file to another file. It turns out that because XOR is its own inverse, the same algorithm with the same password will decrypt the file so you only need to run the program again on the encrypted file to decrypt it.
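The self-inverse property of XOR can be checked in a few lines. The following is only a minimal sketch: the class name `XorInverseDemo`, the helper `xor`, and the keystream bytes are made up for illustration and are not real ChaCha20 output.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class XorInverseDemo {
    // XOR each data byte with the keystream byte at the same position.
    // Assumes keystream is at least as long as data.
    public static byte[] xor(byte[] data, byte[] keystream) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ keystream[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] clear = "Rosebud".getBytes(StandardCharsets.US_ASCII);
        // Arbitrary illustrative bytes standing in for a keyblock.
        byte[] keystream = {0x13, 0x37, 0x42, 0x5a, 0x01, 0x7f, 0x20};

        byte[] cipher = xor(clear, keystream);     // "encrypt"
        byte[] roundTrip = xor(cipher, keystream); // "decrypt" with the same call

        System.out.println(Arrays.equals(clear, roundTrip)); // prints "true"
    }
}
```

Because `(b ^ k) ^ k == b` for every byte, the program needs no separate decryption mode.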

To encrypt or decrypt the file, read a byte from the input file opened in binary mode, XOR it with the corresponding byte in the ChaCha20 keyblock, then write the result to the output file, also opened in binary mode. Once you have used all 64 bytes in the keyblock, generate the next keyblock using the ChaCha20 algorithm.
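The read/XOR/write loop and the keyblock generation might be sketched as follows. This is not the reference implementation. It assumes the original ChaCha20 state layout (64-bit block counter plus 64-bit nonce, which fits the 8-byte nonce used in this project), 20 rounds, and a block counter that starts at 0. The class and method names (`ChaCha20Sketch`, `quarterRound`, `keyBlock`, `xorStream`) are hypothetical, and the output may not match the sample ciphertext byte for byte if the reference implementation makes different choices.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class ChaCha20Sketch {
    // The ASCII string "expand 32-byte k" as four little-endian 32-bit words.
    private static final int[] SIGMA = {0x61707865, 0x3320646e, 0x79622d32, 0x6b206574};

    // One ChaCha quarter round on state words x[a], x[b], x[c], x[d], in place.
    public static void quarterRound(int[] x, int a, int b, int c, int d) {
        x[a] += x[b]; x[d] = Integer.rotateLeft(x[d] ^ x[a], 16);
        x[c] += x[d]; x[b] = Integer.rotateLeft(x[b] ^ x[c], 12);
        x[a] += x[b]; x[d] = Integer.rotateLeft(x[d] ^ x[a], 8);
        x[c] += x[d]; x[b] = Integer.rotateLeft(x[b] ^ x[c], 7);
    }

    // Build the 64-byte keyblock for a 32-byte key, 8-byte nonce and block counter.
    public static byte[] keyBlock(byte[] key, byte[] nonce, long counter) {
        int[] init = new int[16];
        System.arraycopy(SIGMA, 0, init, 0, 4);
        ByteBuffer k = ByteBuffer.wrap(key).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < 8; i++) init[4 + i] = k.getInt();
        init[12] = (int) counter;          // low 32 bits of the counter
        init[13] = (int) (counter >>> 32); // high 32 bits of the counter
        ByteBuffer n = ByteBuffer.wrap(nonce).order(ByteOrder.LITTLE_ENDIAN);
        init[14] = n.getInt();
        init[15] = n.getInt();

        int[] x = init.clone();
        for (int i = 0; i < 10; i++) {     // 10 double rounds = 20 rounds
            quarterRound(x, 0, 4, 8, 12);  // column rounds
            quarterRound(x, 1, 5, 9, 13);
            quarterRound(x, 2, 6, 10, 14);
            quarterRound(x, 3, 7, 11, 15);
            quarterRound(x, 0, 5, 10, 15); // diagonal rounds
            quarterRound(x, 1, 6, 11, 12);
            quarterRound(x, 2, 7, 8, 13);
            quarterRound(x, 3, 4, 9, 14);
        }
        ByteBuffer out = ByteBuffer.allocate(64).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < 16; i++) out.putInt(x[i] + init[i]);
        return out.array();
    }

    // XOR every input byte with the keystream, refilling the keyblock when exhausted.
    public static void xorStream(InputStream in, OutputStream out, byte[] key, byte[] nonce)
            throws IOException {
        long counter = 0;
        byte[] block = keyBlock(key, nonce, counter);
        int pos = 0;
        int b;
        while ((b = in.read()) != -1) {
            if (pos == block.length) {
                block = keyBlock(key, nonce, ++counter);
                pos = 0;
            }
            out.write(b ^ block[pos++]); // write() keeps only the low 8 bits
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] key = new byte[32];    // all-zero key and nonce, for illustration only
        byte[] nonce = new byte[8];
        byte[] clear = new byte[150]; // long enough to span three keyblocks
        ByteArrayOutputStream enc = new ByteArrayOutputStream();
        xorStream(new ByteArrayInputStream(clear), enc, key, nonce);
        ByteArrayOutputStream dec = new ByteArrayOutputStream();
        xorStream(new ByteArrayInputStream(enc.toByteArray()), dec, key, nonce);
        System.out.println(Arrays.equals(clear, dec.toByteArray())); // prints "true"
    }
}
```

For real files, the streams would typically be a `FileInputStream` and `FileOutputStream` wrapped in `BufferedInputStream` and `BufferedOutputStream`, since reading one byte at a time from an unbuffered file stream is slow.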

The program should take three arguments on the command line: the name of the input file, the name of the output file, and the password to use. Repeat the password as many times as needed to fill the 32 bytes required for the ChaCha20 key, truncating any excess. For the nonce, pad the password with "1234567", then truncate the result to 8 bytes. In practice the nonce is usually set to a random value when ChaCha20 is used as a communications stream cipher.
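One way to derive the key and nonce from the password is sketched below. The class and method names (`KeyMaterialSketch`, `deriveKey`, `deriveNonce`) are hypothetical, and encoding the password as UTF-8 is an assumption; the project text does not say which encoding the reference implementation uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class KeyMaterialSketch {
    // Repeat the password bytes until 32 bytes are filled, truncating any excess.
    public static byte[] deriveKey(String password) {
        byte[] pw = password.getBytes(StandardCharsets.UTF_8); // encoding is an assumption
        if (pw.length == 0) {
            throw new IllegalArgumentException("password must not be empty");
        }
        byte[] key = new byte[32];
        for (int i = 0; i < key.length; i++) {
            key[i] = pw[i % pw.length];
        }
        return key;
    }

    // Pad the password with "1234567", then keep only the first 8 bytes.
    public static byte[] deriveNonce(String password) {
        byte[] padded = (password + "1234567").getBytes(StandardCharsets.UTF_8);
        return Arrays.copyOf(padded, 8);
    }

    public static void main(String[] args) {
        String password = "Rosebud";
        // prints "RosebudRosebudRosebudRosebudRose"
        System.out.println(new String(deriveKey(password), StandardCharsets.UTF_8));
        // prints "Rosebud1"
        System.out.println(new String(deriveNonce(password), StandardCharsets.UTF_8));
    }
}
```

Note that a password of 8 or more bytes makes the "1234567" padding irrelevant, since only the first 8 bytes survive the truncation.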

For example, suppose you had a cleartext file called cleartext.txt that contained the following paragraph.

Orson Welles (May 6, 1915 - October 10, 1985) was an American actor, director, screenwriter and producer who is remembered for his innovative work in radio, theatre and film.

Using the Linux hexdump command we can see the values of the bytes in the file in both hexadecimal notation and as characters on the right.

% hexdump -C cleartext.txt
00000000  4f 72 73 6f 6e 20 57 65  6c 6c 65 73 20 28 4d 61  |Orson Welles (Ma|
00000010  79 20 36 2c 20 31 39 31  35 20 2d 20 4f 63 74 6f  |y 6, 1915 - Octo|
00000020  62 65 72 20 31 30 2c 20  31 39 38 35 29 20 77 61  |ber 10, 1985) wa|
00000030  73 20 61 6e 20 41 6d 65  72 69 63 61 6e 20 61 63  |s an American ac|
00000040  74 6f 72 2c 20 64 69 72  65 63 74 6f 72 2c 20 73  |tor, director, s|
00000050  63 72 65 65 6e 77 72 69  74 65 72 20 61 6e 64 20  |creenwriter and |
00000060  70 72 6f 64 75 63 65 72  20 77 68 6f 20 69 73 20  |producer who is |
00000070  72 65 6d 65 6d 62 65 72  65 64 20 66 6f 72 20 68  |remembered for h|
00000080  69 73 20 69 6e 6e 6f 76  61 74 69 76 65 20 77 6f  |is innovative wo|
00000090  72 6b 20 69 6e 20 72 61  64 69 6f 2c 20 74 68 65  |rk in radio, the|
000000a0  61 74 72 65 20 61 6e 64  20 66 69 6c 6d 2e 0a     |atre and film..|

After running our program to generate the ciphertext.bin file using the password ‘Rosebud’ we would see the following scrambled contents.

$ hexdump -C ciphertext.bin
00000000  cb 61 60 d1 79 47 32 fa  a8 de 54 5b 3e 38 2f 02  |.a`.yG2...T[>8/.|
00000010  44 5e ca aa 9f 48 0e 0d  6d 0a 31 57 06 6c 8a 6b  |D^...H..m.1W.l.k|
00000020  d5 b1 5f 40 87 68 bf e1  08 66 99 e6 e3 14 e4 97  |.._@.h...f......|
00000030  58 f1 0c 00 0b e6 3a 32  e4 2c f1 16 59 0f 0a b5  |X.....:2.,..Y...|
00000040  b0 1c 78 10 34 6f 65 dd  84 31 40 9a 55 82 06 51  |..x.4oe..1@.U..Q|
00000050  23 83 39 11 c7 ce a6 f8  68 cd 92 80 2d 79 9f b1  |#.9.....h...-y..|
00000060  9c 2a b2 0e f9 5c 4c b1  1d fc b7 8b 28 0c f1 3f  |.*...\L.....(..?|
00000070  0f 3f 07 92 28 cf 02 20  17 28 57 51 8e 71 cd 79  |.?..(.. .(WQ.q.y|
00000080  1f 64 8a b7 5f 30 42 ab  50 a1 57 7e 3a bb e4 f2  |.d.._0B.P.W~:...|
00000090  17 a2 55 ee 2c b1 cb 84  f9 7b ad cf ef 8c 85 ca  |..U.,....{......|
000000a0  c6 91 a3 c6 51 0d 5e 57  01 de f5 6b e8 05 c2     |....Q.^W...k...|

Here are the commands to encrypt, decrypt and compare the final results:

$ ./ChaCha20Cipher cleartext.txt ciphertext.bin Rosebud
$ ./ChaCha20Cipher ciphertext.bin cleartext_decrypt.txt Rosebud
$ diff cleartext.txt cleartext_decrypt.txt

The diff command shouldn’t print any differences. On Windows use the “comp” command instead of “diff”.
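If neither diff nor comp is at hand, the same check can be done portably in Java. This is only a sketch: the class name `SameBytes` and its helper are hypothetical and not part of the project. It reads both files fully into memory, which is fine for small test files like the ones here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

public class SameBytes {
    // True when both files have exactly the same length and contents.
    public static boolean sameBytes(Path a, Path b) throws IOException {
        return Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java SameBytes <file1> <file2>");
            System.exit(2);
        }
        boolean same = sameBytes(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(same ? "identical" : "different");
        System.exit(same ? 0 : 1); // mirror diff's exit status convention
    }
}
```

It would be run as `java SameBytes cleartext.txt cleartext_decrypt.txt` after the encrypt and decrypt steps.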

Even though this can be an efficient encryption technique, what are the chances that the ciphertext can be deciphered? Why is this? What is the worst case?

See [[ChaCha20 Cipher Reference]] for a reference implementation.

Output
$ javac -Xlint ChaCha20Cipher.java
$ java -ea ChaCha20Cipher ../../data/text/GettysburgAddress.txt output/testGettysburgAddress.cc20 AbrahamLincoln
$ javac -Xlint ChaCha20Cipher.java
$ java -ea ChaCha20Cipher output/testGettysburgAddress.cc20 output/testGettysburgAddress.cc20.txt AbrahamLincoln
$ diff ../../data/text/GettysburgAddress.txt output/testGettysburgAddress.cc20.txt
$ javac -Xlint ChaCha20Cipher.java
$ java -ea ChaCha20Cipher ../../data/text/UnicodeTest.utf8 output/testUnicodeTest.cc20 Unicode
$ javac -Xlint ChaCha20Cipher.java
$ java -ea ChaCha20Cipher output/testUnicodeTest.cc20 output/testUnicodeTest.cc20.utf8 Unicode
$ diff ../../data/text/UnicodeTest.utf8 output/testUnicodeTest.cc20.utf8

Solution