Lab 3: Building a Python Toolbox
Objectives
In this lab you will accomplish two tasks. First, you will construct a toolbox or module of tools for manipulating words and word lists. Then, when finished with your module, you will use your toolbox to answer some trivia questions. In doing this lab, you will gain experience with the following:
Using sequences in Python (lists and strings), and associated operators and methods;
Writing simple and nested loops;
Writing doctests to test your functions; and
Creating a module in Python.
Note: You may find it useful to refer to our Strings and Lists Cheat Sheet while working on this lab.
NPR Puzzles
Will Shortz is the puzzle master at National Public Radio. Each Sunday morning he challenges listeners with a puzzle to solve by the following Thursday. Typically these are challenges that test one’s vocabulary, but, as we’ll see, we can frequently compute their solutions.
Here are some interesting problems (in no particular order):
P1. (Proposed February 11, 2018.) Name part of the human body in six letters. Add an ‘r’ and rearrange the result to name a part of the body in seven letters. What is it?
P2. (Proposed August 16, 2020.) Think of a major city in France whose name is an anagram of a major city in Italy. What cities are they? (Note: An anagram is a word, phrase, or name formed by rearranging the letters of another.)
P3. (Proposed September 23, 2018 by Jim Levering of San Antonio) Think of a disease in five letters. Shift each letter three spaces later in the alphabet—for example, ‘a’ would become ‘d’, ‘b’ would become ‘e’, etc. The result will be a prominent name from the Bible. Who is it?
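The letter shift described in P3 can be illustrated with a minimal sketch. The function names shift_letter and shift_word are hypothetical (they are not part of the starter code), the sketch assumes lowercase letters only, and it assumes that shifting past 'z' wraps around to 'a', which the puzzle statement does not spell out.

```python
def shift_letter(letter, offset):
    """Return the lowercase letter that is offset places later in the alphabet.

    Assumption: shifting past 'z' wraps around to 'a'.
    """
    position = ord(letter) - ord('a')          # 'a' -> 0, 'b' -> 1, ...
    return chr(ord('a') + (position + offset) % 26)


def shift_word(word, offset):
    """Return word with every (lowercase) letter shifted by offset places."""
    result = ''                                # accumulation variable
    for letter in word:
        result = result + shift_letter(letter, offset)
    return result


print(shift_word('abc', 3))   # def
print(shift_word('xyz', 3))   # abc (assumed wraparound)
```

The words used here are arbitrary test strings, not puzzle answers.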
Spelling Bee Puzzles
The Spelling Bee puzzle from the New York Times is also a source of interesting word problems. These words are spelled with an alphabet (called a “hive”) of at most seven letters.
Here are some more interesting problems (again, in no particular order):
B1. How many lowercase 7-letter isograms are in the word list 'words/dict.txt'? (Note: An isogram is a word without any repeated letters.)
B2. (September 22, 2020.) Suppose you have a seven letter hive, 'mixcent'. How many 4-letter lowercase words in 'words/dict.txt' (1) include 'm' and (2) are spelled only using (possibly repeated) letters from the hive string?
Are you up for solving one or more of these challenges?
Getting Started
Before you begin, clone this week’s repository in the usual manner.
Open the Terminal and cd into your cs134 directory:
cd cs134
Clone your repository from https://evolene.cs.williams.edu with the following command, where you should replace 22xyz3 with your CS username:
git clone https://evolene.cs.williams.edu/cs134-labs/22xyz3/lab03.git
Navigate to your newly created lab03 subdirectory in the Terminal:
cd lab03
Open Atom, go to the File menu option, choose Add Project Folder, navigate to your lab03 directory, and click Open. The lab03 starter files will be in the left pane of Atom.
Part 1: wordTools Module
The goal of this week is to build a module of utilities, called wordTools, for manipulating strings and lists of words. Our hope is to help people who wish to solve puzzles like those described above.
We have given you several functions in wordTools.py that may be helpful to you in writing more powerful functions and answering the puzzle questions. As you investigate these functions, think about how they might be used to solve more general problems. In the following steps, you will replace each line that says "pass # TODO: replace with your code" with your own code.
Start by reviewing the wordTools.py script, paying careful attention to the docstrings and doctests in the functions. Let’s gain some experience with doctests before writing any code.
Notice that when you run wordTools.py as a script, below the if __name__ == '__main__' line our code calls the testmod() function from the doctest module. This function runs all of the interactive examples found in the docstrings of our functions (called doctests) and verifies that they produce the correct results. We can use doctests to make sure our functions perform as expected.
Currently, one doctest associated with the canon() function fails when you run wordTools.py as a script. The canon() function takes a string word as input and returns a “canonical” version of word, which consists of just its letters (without punctuation marks or special characters), in lower case, in alphabetical order. For example, canon('Mama Mia!') is the string 'aaaimmm'.
Fix the doctest so that when the script wordTools.py is executed, the canon() function passes its tests (you will still get errors about the uniques() and readWords() functions; just ignore those for now). Throughout this semester you will be required to use this testing process to demonstrate that the functions you write are implemented correctly. Note that you should not modify any code in the function body of canon() for this step; you are only modifying the doctest.
Now, let’s extend the toolbox. First, complete the function uniques(word), which takes as input a string word and returns a string consisting of the unique characters in word. For example, uniques('abracadabra') should return 'abrcd'. Incorporate two new doctests into the docstring associated with uniques() that test interesting strings.
(Hint: Use a loop that updates an accumulation variable in uniques().)
Next, complete the function isIsogram(word), which takes as input a string word and returns True if all of the characters in word are unique, and False otherwise. Case should be ignored. For example, the strings 'Rohit', 'Lida', and 'CS134' are isograms, but 'Jeannie' and 'StEve' are not. Incorporate at least two doctests into the docstring of isIsogram() that test other interesting strings.
(Hint: Your implementation of isIsogram() should call uniques().)
Your next job is to write a function, sized(n, wordList), that takes as input a word length, n, and a word list, wordList, and returns a list of the words in wordList that are exactly length n. For example:
>>> sized(3, ['cat', 'dog', 'goat'])
['cat', 'dog']
>>> sized(5, ['frog', 'duck', 'mouse'])
['mouse']
Write two new doctests to help verify that your sized() function works as expected.
Note that we have given you a function called readWords(filename) that takes as input the path to a file, filename, reads the file, and returns a list of the words found in it, one per line. A “word”, like 'New York', may include spaces internally, but not at its ends. You do not need to modify this function, but you may want to use it to solve the puzzles. Spend a few minutes reviewing it. You might use this function in the following ways:
>>> len(readWords('words/firstNames.txt'))
5166
>>> readWords('words/bodyParts.txt')[14]
'belly button'
>>> sized(8, readWords('words/italianCities.txt'))
['Cagliari', 'Florence', 'Siracusa']
Finally, review your wordTools toolkit, ensuring it is a solidly built module:
Complete the triple-quoted docstring at the top of the file. This helps users understand the purpose of this module. You can check all your documentation with:
pydoc3 wordTools
Pressing q will exit the pydoc viewer if it does not exit automatically.
Make sure that every function is also documented with a helpful docstring.
Thoroughly test each function. You might, for example, import the particular function into interactive Python and make sure it works as you expect.
Include, in each docstring, at least two doctests (>>>) for each function in wordTools.
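As an illustration of the accumulation loop and the doctest workflow described in the steps above, here is a minimal sketch of a tiny stand-alone module. The function vowels() is hypothetical and is not part of wordTools; it is only meant to show the general shape of a module docstring, a function docstring with two doctests, and the call to doctest.testmod().

```python
"""A tiny example module in the spirit of wordTools (hypothetical)."""
import doctest


def vowels(word):
    """Return a string of the lowercase vowels in word, in order of appearance.

    >>> vowels('abracadabra')
    'aaaaa'
    >>> vowels('rhythm')
    ''
    """
    result = ''                  # accumulation variable
    for ch in word:
        if ch in 'aeiou':        # only lowercase vowels are recognized
            result = result + ch
    return result


if __name__ == '__main__':
    # Runs every >>> example in the docstrings of this file;
    # prints nothing when all of them pass.
    doctest.testmod()
```

If this were saved as a file and run as a script, a failing doctest would be reported with the expected and actual values, which is the kind of report you should see for canon() before you fix its doctest.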
Part 2: Solving Puzzles
We’re finally ready to solve some puzzles! In the words folder of your repository, we have provided a collection of text files containing word lists that may be useful. (The words/README.txt file describes the contents of these word lists.)
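To make the one-word-per-line file format concrete, here is a minimal sketch of reading such a file. The function read_word_list() is a hypothetical stand-in, not the readWords() function from the starter code (whose implementation may differ), and the sample file is created on the fly rather than taken from the words folder. Skipping blank lines is an assumption.

```python
import os
import tempfile


def read_word_list(filename):
    """Return the list of words found, one per line, in the file filename."""
    words = []
    with open(filename) as f:
        for line in f:
            word = line.strip()      # drop the newline and any spaces at the ends
            if word:                 # assumption: blank lines are skipped
                words.append(word)
    return words


# Build a throwaway sample file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'sample.txt')
    with open(path, 'w') as f:
        f.write('elbow\nbelly button\nknee\n')
    sample = read_word_list(path)

print(sample)        # ['elbow', 'belly button', 'knee']
print(len(sample))   # 3
```

Note that 'belly button' survives as a single word: only the whitespace at the ends of each line is removed.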
Start by solving spelling-bee puzzle B1 as described at the beginning of this handout. In particular, in the Python script puzzles.py provided in the starter, complete the definition of function b1() that returns the solution to the puzzle.
Next, you may solve either the NPR puzzle P1 or P2 as described above. You must solve at least one of these! If you want extra practice, try solving both. As above, complete the definition of the appropriate function (named after the puzzle) that returns the solution as a string consisting of the pair of answers (in any order) separated by a space. For example, if the solution to P1 is 'stomach' and 'cartilage', the function p1() should return the string 'stomach cartilage' or 'cartilage stomach'.
Extra Credit: If you would like a challenge, check out problems B2 and P3. These are not required! A small amount of extra credit will be given if you solve one or both of them.
(Hint: You might want to write a helper function (or two) to solve P3.)
Good luck! Do not forget to add, commit, and push your work as it progresses! Test your code often to simplify debugging.
When you are finished, specify collaborators in README.md. Then add, commit, and push all of your work to evolene. This will include the completed wordTools.py and puzzles.py.
Submit Your Work
When you are finished with the lab, be sure to add and commit your work.
git add wordTools.py puzzles.py
git commit -m "Lab 3 completed"
Then push your work (remembering to start the VPN if you’re working from off campus):
git push
You can, if you wish, check that your work is up-to-date on https://evolene.cs.williams.edu, or with git status in the Terminal window:
git status
Please edit the README.md file and enter the names of any students you collaborated with on the Collaboration line. Commit and push this change.
Gradesheet.txt gives a breakdown of the rubric you will be graded on for this lab. When graded, this file will contain the feedback as well.
Grading Notes
Your code for the puzzles must compute each answer as directly as possible. In addition, you should make use of the tools imported from your wordTools module whenever possible.
We are looking for solutions that do not use too many for loops or iterate over the word lists more than is necessary. For example, P1 and P2 can be solved using a nested for loop. If you find yourself writing more than 4 loops, it may be best to review your strategy with a TA or an instructor.
Make sure you implement the functions of wordTools carefully. Do not modify function names or interpret parameters differently. Make sure your functions return the results described. This document serves, in some way, as a contract between you and your users. Deviating from this contract makes it hard for potential users to adopt your implementation!
Functionality and programming style are important, just as both the content and the writing style are important when writing an essay. Make sure your variables are named well, and that your use of comments, white space, and line breaks promotes readability. We expect to see code that makes your logic as clear and easy to follow as possible. The Python Style Guide is available on the course website to help you with stylistic decisions.
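The nested for loop pattern mentioned in the notes above can be sketched on toy data. The function name same_initial_pairs and the two word lists are hypothetical, the words are assumed to be non-empty, and the matching rule (same first letter) is chosen only for illustration; it is not a solution to any of the puzzles.

```python
def same_initial_pairs(first_list, second_list):
    """Return 'a b' strings for every a in first_list and b in second_list
    that start with the same letter. Assumes no word is the empty string.
    """
    pairs = []
    for a in first_list:              # outer loop: each word of the first list
        for b in second_list:         # inner loop: compare it with each word of the second
            if a[0] == b[0]:
                pairs.append(a + ' ' + b)
    return pairs


print(same_initial_pairs(['cat', 'dog'], ['cow', 'duck', 'deer']))
# ['cat cow', 'dog duck', 'dog deer']
```

Each list is traversed by exactly one loop, so the whole search takes two loops in total, well within the loop budget suggested above.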
As always, the file GradeSheet.txt in your lab03 repository goes over the grading guidelines and documents our expectations.