# Lab 3: Building a Python Toolbox

Objectives

In this lab you will accomplish two tasks. First, you will construct a toolbox or module of tools for manipulating words and word lists. Then, when finished with your module, you will use your toolbox to answer some trivia questions. In doing this lab, you will gain experience with the following:

• Using sequences in Python (lists and strings), and associated operators, methods, functions;

• Writing simple and nested loops;

• Writing doctests to test your functions; and

• Creating a module in Python.

Note: You may find it useful to refer to our Strings and Lists Cheat Sheet while working on this lab.

## NPR Puzzles

Will Shortz is the puzzle master at National Public Radio. Each Sunday morning he challenges listeners with a puzzle to solve by the following Thursday. Typically these are challenges that test one’s vocabulary, but, as we’ll see, we can frequently compute their solutions.

Here are some interesting problems (in no particular order):

• P1. (Proposed February 11, 2018.) Name part of the human body in six letters. Add an ‘r’ and rearrange the result to name a part of the body in seven letters. What is it?

• P2. (Proposed August 16, 2020.) Think of a major city in France whose name is an anagram of a major city in Italy. What cities are they? (Note: An anagram is a word, phrase, or name formed by rearranging the letters of another.)

• P3. (Proposed September 23, 2018 by Jim Levering of San Antonio) Think of a disease in five letters. Shift each letter three spaces later in the alphabet—for example, ‘a’ would become ‘d’, ‘b’ would become ‘e’, etc. The result will be a prominent name from the Bible. Who is it?
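
The anagram definition in P2 suggests a simple test: two words are anagrams exactly when their letters, once sorted, are identical. The sketch below is only an illustration of that idea; the name `areAnagrams` is hypothetical and is not part of the starter code.

```python
def areAnagrams(first, second):
    """Return True if first and second use exactly the same letters.

    Hypothetical helper for illustration only; case is ignored.
    """
    # sorted() turns a string into an alphabetized list of its characters
    return sorted(first.lower()) == sorted(second.lower())


print(areAnagrams('listen', 'silent'))  # True
print(areAnagrams('apple', 'pale'))     # False
```

Comparing sorted letters avoids trying every rearrangement of a word, which would be far slower.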

## Spelling Bee Puzzles

The Spelling Bee puzzle from the New York Times is also a source of interesting word problems. These words are spelled with an alphabet (called a “hive”) of at most seven letters.

Here are some more interesting problems (again, in no particular order):

• B1. How many lowercase 7-letter isograms are in the word list `'words/dict.txt'`? (Note: An isogram is a word without any repeated letters.)

• B2. (September 22, 2020.) Suppose you have a seven letter hive, `'mixcent'`. How many 4-letter lowercase words in `'words/dict.txt'` (1) include `'m'` and (2) are spelled only using (possibly repeated) letters from the hive string?
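
The two conditions in B2 can be checked one word at a time. The following is a rough sketch under the assumption that the word is already lowercase; the function name `fitsHive` and its parameters are made up for this example and do not appear in the starter code.

```python
def fitsHive(word, hive, required):
    """Return True if word contains required and uses only letters in hive.

    Hypothetical helper for illustration only.
    """
    if required not in word:
        return False
    for letter in word:
        if letter not in hive:   # a single outside letter disqualifies the word
            return False
    return True


print(fitsHive('mine', 'mixcent', 'm'))  # True
print(fitsHive('nice', 'mixcent', 'm'))  # False: no 'm'
print(fitsHive('mind', 'mixcent', 'm'))  # False: 'd' is not in the hive
```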

Are you up for solving one or more of these challenges?

## Getting Started

Before you begin, clone this week’s repository in the usual manner.

1. Open the Terminal and `cd` into your cs134 directory:

```
cd cs134
```
2. Clone your repository from https://evolene.cs.williams.edu with the following command, where you should replace `23xyz3` with your CS username.

```
git clone https://evolene.cs.williams.edu/cs134-labs/23xyz3/lab03.git
```
3. Navigate to your newly created lab03 subdirectory in the Terminal:

```
cd lab03
```
4. Open VS Code, go to the `File` menu, choose `Open Folder`, navigate to your `lab03` directory, and click `Open`. The `lab03` starter files will appear in the left pane of VS Code.

## Part 1: wordTools Module

The goal of this week is to build a module of utilities, called `wordTools`, for manipulating strings and lists of words. Our hope is to help people who wish to solve puzzles like those described above.

We have given you several functions in `wordTools.py` that may be helpful to you in writing more powerful functions and answering the puzzle questions. As you investigate these functions, think about how they might be used to solve more general problems. In the following steps, you will replace the lines that say `pass # TODO:  replace with your code` with your code.

1. Start by reviewing the `wordTools.py` script, paying careful attention to the docstrings and doctests in the functions. Let’s gain some experience with doctests before writing any code.

• Notice that when you run `wordTools.py` as a script, the code below the `if __name__ == '__main__'` line calls the `testmod()` function from the `doctest` module. This function runs all of the interactive examples found in the docstrings of our functions (called doctests) and verifies that they produce the correct results. We can use doctests to make sure our functions perform as expected.

• Currently, one doctest associated with the `canon()` function fails when you run `wordTools.py` as a script. The `canon()` function takes a string `word` as input and returns a “canonical” version of `word` which consists of just its letters (without punctuation marks or special characters), in lower case, in alphabetical order. For example, `canon('Mama Mia!')` is the string `'aaaimmm'`.

• Fix the doctest so that when the script `wordTools.py` is executed, the `canon` function passes its tests. (You will still get errors about the `uniques()` and `readWords()` functions; just ignore those for now.) Throughout this semester you should use this testing process to demonstrate that the functions you write are implemented correctly. Note that you should not modify any code in the function body of `canon()` for this step; you are only modifying the doctest.
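
To see the doctest mechanics in isolation, here is a small self-contained sketch. The function `initials` is invented for this example and is not part of `wordTools.py`; only the `doctest.testmod()` call mirrors what the starter script does.

```python
import doctest


def initials(name):
    """Return the uppercase first letter of each whitespace-separated word.

    Hypothetical function, used only to demonstrate doctests.

    >>> initials('national public radio')
    'NPR'
    >>> initials('new york times')
    'NYT'
    """
    result = ''
    for word in name.split():
        result += word[0].upper()
    return result


if __name__ == '__main__':
    # Runs every >>> example found in this file's docstrings and reports
    # any mismatch; it prints nothing when all examples pass.
    doctest.testmod()
```

If you change one of the expected values (say, `'NPR'` to `'npr'`) and run the file as a script, `doctest` prints a report showing the expected and actual results.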

2. Now, let’s extend the toolbox. First, complete the function `uniques(word)`, which takes as input a string `word` and returns a string consisting of the unique characters in `word`. For example, `uniques('abracadabra')` should return `'abrcd'`. Incorporate two new doctests into the docstring associated with `uniques()` that test interesting strings.

(Hint: Use a loop that updates an accumulation variable in `uniques()`.)
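
As a reminder of the accumulation pattern the hint refers to, here is a sketch applied to a different problem. The function `vowelsIn` is hypothetical and unrelated to the functions you are asked to write; only the shape of the loop carries over.

```python
def vowelsIn(word):
    """Return a string of the lowercase vowels in word, in order of appearance.

    Hypothetical function, used only to illustrate an accumulator loop.

    >>> vowelsIn('abracadabra')
    'aaaaa'
    >>> vowelsIn('rhythm')
    ''
    """
    result = ''                 # accumulation variable starts empty
    for char in word:
        if char in 'aeiou':     # decide whether this character is kept
            result += char      # grow the accumulator
    return result
```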

3. Next, complete the function `isIsogram(word)`, which takes as input a string `word` and returns `True` if all of the characters in `word` are unique, and `False` otherwise. Case should be ignored. For example, the strings `'Lida'` and `'CS134'` are isograms, but `'JeaNnie'` and `'Iris'` are not. Incorporate at least two doctests into the docstring of `isIsogram()` that test other interesting strings.

(Hint: Your implementation of `isIsogram()` should call `uniques()`.)

4. Your next job is to write a function, `sized(n, wordList)`, that takes as input a word length, `n`, and a word list, `wordList`, and returns a list of the words in `wordList` that are exactly length `n`. For example:

```
>>> sized(3, ['cat', 'dog', 'goat'])
['cat', 'dog']
>>> sized(5, ['frog', 'duck', 'mouse'])
['mouse']
```

Write two new doctests to help verify that your `sized()` function works as expected.

5. Note that we have given you a function called `readWords(filename)` that takes as input the path to a file, `filename`, reads the file, and returns a `list` of the words found in it, one per line. A “word”, like `'New York'`, may include spaces internally, but not at its ends. You do not need to modify this function, but you may want to use it to solve the puzzles. Spend a few minutes reviewing it. You might use this function in the following ways:

```>>> len(readWords('words/firstNames.txt'))
5166
'belly button'
['Cagliari', 'Florence', 'Siracusa']
```
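
For intuition, a function with the behavior described above could be sketched as follows. This is not the starter’s implementation: the name `readWordsSketch` is hypothetical, and skipping blank lines is an assumption made for this example.

```python
def readWordsSketch(filename):
    """Return a list of the words in the file at filename, one per line.

    Hypothetical sketch; the provided readWords() may differ in its details.
    """
    words = []
    with open(filename) as wordFile:
        for line in wordFile:
            word = line.strip()     # drop the newline and any spaces at the ends
            if word:                # assumption: blank lines are skipped
                words.append(word)
    return words
```

Because `strip()` only removes whitespace at the ends of a line, an entry such as `New York` keeps its internal space.
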
6. Finally, review your `wordTools` toolkit, ensuring it is a solidly built module:

• Complete the triple-quoted docstring at the top of the file. This helps users understand the purpose of this module. You can check all your documentation with:

```
pydoc3 wordTools
```

Pressing `q` will exit the pydoc viewer if it does not exit automatically.

• Make sure that every function is also documented with a helpful docstring.

• Thoroughly test each function. You might, for example, import the particular function into interactive Python and make sure it works as you expect.

• Include, in each docstring, at least two doctests (`>>>`) for each function in `wordTools`.

## Part 2: Solving Puzzles

We’re finally ready to solve some puzzles! We have provided you with a collection of text files containing relevant collections of words in the `words` folder of your repository that may be useful. (The `words/README.txt` file describes the contents of these word lists.)

1. Start by solving spelling-bee puzzle `B1` as described at the beginning of this handout. In particular, in the Python script `puzzles.py` provided in the starter code, complete the definition of the function `b1()` so that it returns the solution to the puzzle.

2. Next, you may solve either the NPR puzzle `P1` or `P2` as described above. You must solve at least one of these! If you want extra practice, try solving both. As above, complete the definition of the appropriate function (named after the puzzle) that returns the solution as a string consisting of the pair of answers (in any order) separated by a space. For example, if the solution to `P1` is `'stomach'` and `'cartilage'`, the function `p1()` should return the string `'stomach cartilage'` or `'cartilage stomach'`.

3. Extra Credit: If you would like a challenge, check out problems `B2` and `P3`. These are not required! A small amount of extra credit will be given if you solve one or both of them.

(Hint: You might want to write a helper function (or two) to solve P3.)
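
One possible shape for such a helper is sketched below. The name `shiftLetter` is hypothetical, the function assumes a single lowercase letter, and wrapping from `'z'` back to `'a'` is an assumption that the puzzle statement does not spell out.

```python
def shiftLetter(letter, amount):
    """Return the lowercase letter that is amount places later in the alphabet.

    Hypothetical helper for illustration; wraps around after 'z'.
    """
    alphabet = 'abcdefghijklmnopqrstuvwxyz'
    position = alphabet.index(letter)
    # the remainder keeps the new position inside the 26-letter alphabet
    return alphabet[(position + amount) % 26]


print(shiftLetter('a', 3))  # d
print(shiftLetter('b', 3))  # e
print(shiftLetter('y', 3))  # b (wraps around)
```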

Good luck! Do not forget to add, commit, and push your work as it progresses! Test your code often to simplify debugging.

When you are finished, specify collaborators in README.md. Then add, commit, and push all of your work to evolene. This will include the completed `wordTools.py` and `puzzles.py`.

## Submit Your Work

1. When you are finished with the lab, be sure to `add` and `commit` your work.

```
git add wordTools.py puzzles.py
git commit -m "Lab 3 completed"
```

Then `push` your work (remembering to start the `VPN` if you’re working from off campus):

```
git push
```
2. You can, if you wish, check that your work is up-to-date on https://evolene.cs.williams.edu, or with `git status` in the Terminal window:

```
git status
```
3. Please edit the `README.md` file and enter, on the `Collaboration` line, the names of any students you collaborated with. Commit and push this change.

4. `Gradesheet.txt` gives a breakdown of the rubric you will be graded on for this lab. When graded, this file will contain the feedback as well.

1. Your code for the puzzles must compute each answer as directly as possible. In addition, you should make use of the tools imported from your `wordTools` module whenever possible.
2. We are looking for solutions that do not use too many `for` loops or iterate over the word lists more often than necessary. For example, `P1` and `P2` can each be solved using a nested `for` loop. If you find yourself writing more than four loops, it may be best to review your strategy with a TA or an instructor.
3. Make sure you implement the functions of `wordTools` carefully. Do not modify function names or interpret parameters differently. Make sure your functions return the results described. This document serves, in some way, as a contract between you and your users. Deviating from this contract makes it hard for potential users to adopt your implementation!
5. As always, the file `GradeSheet.txt` in your `lab03` repository goes over the grading guidelines and documents our expectations.