CSCI 102T: The Socio-Techno Web

Go back to Lab 7 from last week and finish the Hangman exercise if you have not already. You may need to reread the Warmup exercises that explain for-loops for today's lab. Find a classmate and play each other's games. Work together to make sure that your labs are complete and that everyone feels comfortable with their code.

The use of codes (or ciphers) as a means of hiding the meaning of messages traces its roots to ancient history. The first documented use of codes was by Hebrew scribes in approximately 600 - 500 B.C. The Atbash cipher specified that each letter in a message would be encoded using the corresponding letter in the reversed alphabet. For example, 'A' would be encoded as 'Z', 'B' would be encoded as 'Y', 'C' would be encoded as 'X', and so on. The first known military use of codes was by Julius Caesar in 60 - 50 B.C. The Caesar cipher specified that each letter in the alphabet would be encoded using the letter three positions later in the alphabet. For example, 'A' would be encoded as 'D', 'B' would be encoded as 'E', 'C' would be encoded as 'F', and so on. The code wraps around at the end of the alphabet, so 'X', 'Y', and 'Z' would be encoded as 'A', 'B', and 'C', respectively.

Both the Atbash and Caesar ciphers are examples of substitution ciphers, which are codes in which one letter of the alphabet is substituted for another. A substitution cipher can be described succinctly by specifying its key, i.e., the sequence of letters to which the alphabet is mapped. The keys for the Atbash and Caesar ciphers are shown below. To encode a specific letter using one of these ciphers, simply find the corresponding letter in the key below it.

    Atbash cipher:                        Caesar cipher:
        ABCDEFGHIJKLMNOPQRSTUVWXYZ            ABCDEFGHIJKLMNOPQRSTUVWXYZ
        ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓             ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓
        ZYXWVUTSRQPONMLKJIHGFEDCBA            DEFGHIJKLMNOPQRSTUVWXYZABC
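
Both keys follow a simple pattern, so they can also be generated by a short loop instead of being typed out. The sketch below is only an illustration: the names ALPHABET, atbashKey, and caesarKey are hypothetical and are not part of the lab code, and caesarKey assumes the shift is a non-negative whole number.

```javascript
// Hypothetical helpers (not part of the lab code) that build substitution keys.
var ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

function atbashKey() {
    // Walk the alphabet backwards, appending each letter to the key.
    var key = "";
    for (var i = ALPHABET.length - 1; i >= 0; i--) {
        key = key + ALPHABET.charAt(i);
    }
    return key;
}

function caesarKey(shift) {
    // Move each position forward by shift; the % operator wraps past 'Z'.
    // Assumes shift is a non-negative whole number.
    var key = "";
    for (var i = 0; i < ALPHABET.length; i++) {
        key = key + ALPHABET.charAt((i + shift) % ALPHABET.length);
    }
    return key;
}

console.log(atbashKey());   // ZYXWVUTSRQPONMLKJIHGFEDCBA
console.log(caesarKey(3));  // DEFGHIJKLMNOPQRSTUVWXYZABC
```

The remainder operator (%) is what makes the Caesar key wrap around: position 23 ('X') plus 3 is 26, and 26 % 26 is 0, which is the position of 'A'.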

For example, the Atbash cipher would encode the word "CODE" as "XLWV", whereas the Caesar cipher would encode it as "FRGH". Although both of these ciphers were effective in their time (when very few people could read at all), their simple patterns of encoding letters seem pretty obvious today. In principle, though, a substitution cipher can specify any mapping from letters to letters. For example:

    Mystery cipher:                
        ABCDEFGHIJKLMNOPQRSTUVWXYZ 
        ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓  
        QWERTYUIOPASDFGHJKLZXCVBNM

Substitution ciphers have several attractive features. For one, they are relatively simple to understand and use. They are also reasonably effective. There are 26! (that's 26 factorial, or roughly 4 x 10²⁶) different arrangements of the 26 letters in the alphabet. Since each of these arrangements may be used as a key for a substitution cipher, there are 26! different codes that can be used. By selecting one of these keys, the corresponding cipher can be used to encode messages. As long as the recipient of the message has that same key, the message can be easily decoded. Without the key, however, decoding a message can be extremely difficult.
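
To see where the figure of roughly 4 x 10²⁶ comes from, 26! can be computed directly. This is a minimal sketch, assuming a browser or JavaScript engine that supports BigInt values (numbers written with an n suffix); the function name factorial is hypothetical and not part of the lab code.

```javascript
// Hypothetical helper: counts the possible keys by computing 26!.
// BigInt values (the "n" suffix) are used because 26! is too large for an
// ordinary JavaScript number to hold exactly.
function factorial(n) {
    var result = 1n;
    for (var i = 2n; i <= n; i++) {
        result = result * i;   // multiply in 2, 3, 4, ... up to n
    }
    return result;
}

console.log(factorial(26n).toString());  // 403291461126605635584000000
```

The result has 27 digits and begins with 403, which is about 4 x 10²⁶.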

Try a few examples to make sure you understand before moving on. Encode each message in the table below by hand using each of the three ciphers listed above. We'll check your answers using your program later.

    message     Atbash      Caesar      Mystery
    ABCDE       ______      ______      ______
    FOO         ______      ______      ______
    SECRET      ______      ______      ______

As the previous exercise showed, encoding messages using a substitution cipher is relatively straightforward. The following steps must take place:

    for as many letters as there are in the original message (starting with the first letter)
        look at the next character in the message
        find its position in the alphabet
        find the corresponding letter in the key
        use that corresponding letter to encode the original character in the message

These steps can be accomplished in JavaScript as a function that takes the key and a message as input parameters and returns the encoded message. The Encode function below performs this encoding, using the variable coded to "accumulate" the coded message. The variable is initially assigned "", which is called the empty string because it contains no characters. As each letter in the message is processed, the appropriate code letter is concatenated (or appended) onto the end of coded. After traversing the entire message, coded will contain the complete encoded version of the message. You did something similar in Hangman to keep track of "letters used." (Note: the new string features used in this code are explained below.)

    function Encode(key, message)
    // Given  : key is a string consisting of the 26 letters in arbitrary order,
    //          message is the string to be encoded using the key
    // Returns: the coded version of message using the substitution key 
    {
        var alphabet, coded, i, ch, index;

        alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

        coded = "";                                      
        for (i = 0; i < message.length; i++) {        // for as many letters as there are
            ch = message.charAt(i);                   //   access the letter in the message
            index = alphabet.indexOf(ch);             //   find its position in alphabet
            if (index == -1) {                        //   if it's not a letter,
                coded = coded + ch;                   //     then leave it as is & add
            }                                         //   otherwise,
            else {                                    //     find the corresponding
                coded = coded + key.charAt(index);    //     letter in the key & add
            }
        }
        return coded;
    }

The Encode function introduces several new features of JavaScript string variables. In JavaScript, a string can be used like an object. As opposed to just storing a simple value, an object may encapsulate numerous attributes (i.e., variables) and operations on those attributes (i.e., functions) in a single entity. These attributes and operations can be accessed by specifying the string variable, followed by a period, followed by the name of the attribute or operation. For example, a string variable has an attribute called length which specifies how many characters are stored in the string. In the Encode function, this attribute is accessed using message.length. This value is used to specify the number of for-loop repetitions.

The charAt function on strings will return the character stored at a particular index of a string (recall that the first character in the string is considered to be at index 0, the second character at index 1, and so on). This function is used at two places in the Encode function, to access each character in the message (message.charAt(i)), and to find the corresponding character in the key (key.charAt(index)). Similarly, the indexOf function will return the index of the first occurrence of a character in the string (or -1 if not found). This function is used in Encode to find the location of each character in the alphabet (alphabet.indexOf(ch)).
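
The short sketch below tries out length, charAt, and indexOf on their own, outside of Encode. The variable name word is made up for this illustration; the output goes to the browser's JavaScript console via console.log.

```javascript
// Trying out the string features used by Encode.
var word = "CODE";
var alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

console.log(word.length);            // 4  (number of characters)
console.log(word.charAt(0));         // C  (the first character is at index 0)
console.log(word.charAt(3));         // E  (the last character is at index length - 1)
console.log(alphabet.indexOf("C"));  // 2  (A is 0, B is 1, C is 2)
console.log(alphabet.indexOf("?"));  // -1 (not found in the alphabet)
```

The last line shows why Encode checks for -1: any character that is not an uppercase letter has no position in alphabet.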

The following HTML code uses the Encode function to encode messages entered by the user. The user specifies the substitution key in a text box and the message to be encoded in a text area. Then, by clicking on a button, the Encode function is called to encode the message, and the resulting code is displayed in another text area.

    <!DOCTYPE html>
    <html>
    <head>
      <title>Substitution Cipher</title>
      <script type="text/javascript">
        function Encode(key, message)
        // Given  : key is a string of the 26 letters in arbitrary order,
        //          message is the string to be encoded using the key
        // Returns: the coded version of message using the substitution key
        {
            var alphabet, coded, i, ch, index;

            alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

            coded = "";
            for (i = 0; i < message.length; i++) {        // for as many letters as there are
                ch = message.charAt(i);                   //   access the letter in the message
                index = alphabet.indexOf(ch);             //   find its position in alphabet
                if (index == -1) {                        //   if it's not a letter,
                    coded = coded + ch;                   //     then leave it as is & add
                }                                         //   otherwise,
                else {                                    //     find the corresponding
                    coded = coded + key.charAt(index);    //     letter in the key & add
                }
            }
            return coded;
        }
      </script>
    </head>

    <body>
      <form name="CodeForm">
        <table>
          <tr>
            <td> According to the substitution cipher, each letter: </td>
            <td> <kbd>ABCDEFGHIJKLMNOPQRSTUVWXYZ</kbd></td>
          </tr>
          <tr>
            <td> </td>
            <td> <kbd>↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ </kbd></td>
          </tr>
          <tr>
            <td> is encoded using the corresponding letter in the key:</td>
            <td> <input type="text" name="key" size=26 value="XYZABCDEFGHIJKLMNOPQRSTUVW"
                        style="font-family: monospace; font-size: 1em;"></td>
          </tr>
        </table>
        <hr>
        <table>
          <tr>
            <td> <textarea name="decoded" rows=8 cols=30 wrap="virtual"></textarea> </td>
            <td> <input type="button" value="Encode ==>"
                        onclick="document.CodeForm.encoded.value = Encode(document.CodeForm.key.value, document.CodeForm.decoded.value);"></td>
            <td> <textarea name="encoded" rows=8 cols=30 wrap="virtual"></textarea></td>
          </tr>
        </table>
      </form>
      (Use all capital letters for now!)
    </body>
    </html>

Copy and paste the above HTML code (or just Ctrl+click and save this file) into a document called cipher.html. Load this document into the browser and use it to test your encoding predictions from Warmup 2 above.

Does this code handle characters other than uppercase letters, such as lowercase letters, spaces, and punctuation marks? What happens if you enter these characters?

Given an encoded message and the key by which it was encoded, decoding a message is a straightforward process. The steps in the encoding must simply be performed in reverse. That is, each coded letter must be mapped back into the corresponding letter of the alphabet. Consider the Atbash cipher, for example:

    Atbash cipher:                  
        ABCDEFGHIJKLMNOPQRSTUVWXYZ   
        ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓     (mapping to encode)
        ZYXWVUTSRQPONMLKJIHGFEDCBA 
        ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓     (reverse mapping to decode)
        ABCDEFGHIJKLMNOPQRSTUVWXYZ

Since the letter 'A' is encoded as 'Z', the letter 'Z' is decoded by performing the reverse mapping back to 'A'. For each letter in a coded message, this reverse mapping can be made to recover the original letter. The steps involved in decoding an encoded message can be described as follows:

    for as many letters as there are in the encoded message (starting with the first letter)
        look at the next character in the encoded message
        find its position in the key
        find the corresponding letter in the alphabet
        use that letter to decode the letter in the encoded message

Note the similarities between the steps in encoding and decoding messages!
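
To make the reverse mapping concrete, here is a minimal sketch that decodes a single letter by hand using the Atbash key. It is only an illustration of one pass through the steps above, not the Decode function itself; the variable names atbash, codedLetter, position, and original are made up for this example.

```javascript
// Decoding one letter with the Atbash key (illustration only).
var alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
var atbash   = "ZYXWVUTSRQPONMLKJIHGFEDCBA";

var codedLetter = "X";                        // first letter of "XLWV"
var position = atbash.indexOf(codedLetter);   // find its position in the key: 2
var original = alphabet.charAt(position);     // letter at that position in the alphabet

console.log(original);                        // C
```

Compare this with Encode, which looks up the position in alphabet and then reads the letter from the key. Decoding swaps those two roles.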

Add a function called Decode to the cipher.html document. This function should be very similar to the Encode function given to you, except that it decodes messages instead of encoding them. It should have two input parameters, representing the key and the encoded message (just like Encode), and should return the decoded version of that message.

Add another button to the Web page labeled "Decode <==", immediately below the "Encode ==>" button. When the user clicks on that button, the contents of the text area on the right should be decoded, and the corresponding message should be written in the text area on the left.

Test your Decode function thoroughly!