Dictionaries & Lists

Today, we will discuss dictionaries in more detail and explore the differences between unordered collections and sequences.

New Mutable Collection: Dictionaries

Dictionaries are unordered collections that map keys to values.

The main motivation behind dictionaries is support for efficient queries: to look up the value associated with a key, we do not need to search through all the keys. We simply access the dictionary using the key as the subscript, and the dictionary returns the corresponding value.

This makes queries a lot more efficient!

# sample dictionary
zipCodes = {'01267': 'Williamstown', '60606': 'Chicago', 
            '48202': 'Detroit', '97210': 'Portland'}
# what US city has this zip code?
zipCodes['60606'] 
'Chicago'
# what US city has this zip code?
zipCodes['48202']
'Detroit'
# if key does not exist
zipCodes['11777']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/var/folders/md/kwd9nc_d2ns0hw9wsvdrnt2c0000gn/T/ipykernel_42692/1948765118.py in <module>
      1 # if key does not exist
----> 2 zipCodes['11777']

KeyError: '11777'
zipCodes['11777'] = 'Port Jefferson'
zipCodes
{'01267': 'Williamstown',
 '60606': 'Chicago',
 '48202': 'Detroit',
 '97210': 'Portland',
 '11777': 'Port Jefferson'}
len(zipCodes)
5
'90210' in zipCodes
False
'01267' in zipCodes
True
zipCodes.values()
dict_values(['Williamstown', 'Chicago', 'Detroit', 'Portland', 'Port Jefferson'])
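
One way to guard against the KeyError shown above is to test membership with in before subscripting. The sketch below is only an illustration: the helper name lookupCity and the 'unknown' placeholder are assumptions, not part of the lecture code.

```python
zipCodes = {'01267': 'Williamstown', '60606': 'Chicago',
            '48202': 'Detroit', '97210': 'Portland'}

def lookupCity(codes, zipCode):
    """Return the city for zipCode, or 'unknown' if the key is absent.

    lookupCity and the 'unknown' placeholder are illustrative assumptions.
    """
    if zipCode in codes:       # membership test looks at keys only
        return codes[zipCode]  # safe: the key is known to exist
    return 'unknown'

print(lookupCity(zipCodes, '60606'))  # existing key
print(lookupCity(zipCodes, '11777'))  # missing key, no KeyError raised
```

Note that the lookup does not add the missing key; only an assignment such as zipCodes['11777'] = ... does that.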

Creating Dictionaries

Dictionaries can be created in many ways:

  • Direct assignment

  • Starting with an empty dictionary and accumulating key-value pairs

  • Using the dict() function

# direct assignment
scrabbleScore = {'a':1 , 'b':3, 'c':3, 'd':2, 'e':1, 
                 'f':4, 'g':2, 'h':4, 'i':1, 'j':8, 
                 'k':5, 'l':1, 'm':3, 'n':1, 'o':1, 
                 'p':3, 'q':10, 'r':1, 's':1, 't':1, 
                 'u':1, 'v':8, 'w':4, 'x':8, 'y':4, 'z': 10} 
# accumulate in a dictionary
verse = "let it be,let it be,let it be,let it be,there will be an answer,let it be"
counts = {} # empty dictionary
for line in verse.split(','):
    if line not in counts:
        counts[line] = 1 # initialize count
    else:
        counts[line] += 1 # update count
counts
{'let it be': 5, 'there will be an answer': 1}
# use dict() function
dict([('a', 5), ('b', 7), ('c', 10)])
{'a': 5, 'b': 7, 'c': 10}
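
The dict() function accepts a few other input shapes besides a list of tuples. As a small sketch (the variable names here are made up for illustration), it can pair up two parallel lists through zip, or take keyword arguments when the keys are valid identifiers:

```python
letters = ['a', 'b', 'c']
values = [5, 7, 10]

# zip pairs the i-th letter with the i-th value; dict() consumes the pairs
fromPairs = dict(zip(letters, values))

# keyword arguments become string keys (only works for identifier-like keys)
fromKeywords = dict(a=5, b=7, c=10)

print(fromPairs)
print(fromPairs == fromKeywords)
```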

Important Note: Dictionaries are unordered in the sense that items are accessed by key, not by position. Since Python 3.7, the language guarantees that the keys and values of a dictionary are displayed and iterated over in the order in which the keys were inserted (in CPython 3.6 this was an implementation detail). Older versions make no such guarantee, and the order you see always depends on the dictionary’s history of insertions and deletions.
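
To make the effect of insertion and deletion history concrete, here is a minimal sketch. It assumes Python 3.7 or later, where insertion order is guaranteed; the dictionary d is a made-up example.

```python
d = {}
d['b'] = 1
d['a'] = 2
d['c'] = 3
orderAfterInserts = list(d)   # keys in insertion order, not sorted order

d['a'] = 10                   # updating a value keeps the key's position
orderAfterUpdate = list(d)

del d['b']                    # deleting and re-inserting moves the key
d['b'] = 4                    # to the end
orderAfterReinsert = list(d)

print(orderAfterInserts, orderAfterUpdate, orderAfterReinsert)
```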

Iterating over keys of a dictionary

We can iterate directly over a dictionary with a for loop: the loop variable takes on each key in turn, and we can subscript with that key to reach the corresponding value.

calendar = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
            'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
            'Sep': 30, 'Oct': 31, 'Nov': 30, 'Dec': 31} 

for month in calendar:
    print(month, calendar[month], end=",")
Jan 31,Feb 28,Mar 31,Apr 30,May 31,Jun 30,Jul 31,Aug 31,Sep 30,Oct 31,Nov 30,Dec 31,
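
The same key-by-key loop works as an accumulator pattern. The sketch below (the names totalDays and longMonths are illustrative) adds up the days in the year and collects the 31-day months:

```python
calendar = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
            'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
            'Sep': 30, 'Oct': 31, 'Nov': 30, 'Dec': 31}

totalDays = 0      # running sum of all values
longMonths = []    # keys whose value is 31

for month in calendar:            # month is a key
    totalDays += calendar[month]  # subscript with the key to get the value
    if calendar[month] == 31:
        longMonths.append(month)

print(totalDays)
print(longMonths)
```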

Example: frequency

Let’s write a function frequency that takes as input a list of words wordList and returns a dictionary freqDict with the unique words in wordList as keys, and their number of occurrences in wordList as values.

def frequency(wordList):
    """Given a list of words, returns a dictionary of word frequencies"""
    freqDict = {} # initialize accumulator as empty dict
    for word in wordList:
        # check whether word not in dictionary
        if word not in freqDict:
            freqDict[word] = 1
        else:
            freqDict[word] += 1
    return freqDict
frequency(['a', 'a', 'a', 'c', 'b', 'a', 'd'])
{'a': 4, 'c': 1, 'b': 1, 'd': 1}
verseWords = ['let','it','be','let','it','be','there','will','be','an','answer']
frequency(verseWords)
{'let': 2, 'it': 2, 'be': 3, 'there': 1, 'will': 1, 'an': 1, 'answer': 1}

Important Dictionary Method: .get()

  • The get() method is an alternative to subscripting: it retrieves the value associated with a key without requiring us to first check that the key exists

  • It takes two arguments: a key, and an optional default value to use if the key is not in the dictionary

  • It returns the value associated with the given key. If the key does not exist, it returns the default value if one was given, and None otherwise.

Syntax:   val = myDict.get(aKey, defaultVal)

ids = {'rb17': 'Rohit', 'jra1': 'Jeannie', 
       'sfreund': 'Steve', 'lpd2': 'Lida'}
ids.get('lpd2')
'Lida'
print(ids.get('ss32'))
None
ids # .get does not change the dictionary
{'rb17': 'Rohit', 'jra1': 'Jeannie', 'sfreund': 'Steve', 'lpd2': 'Lida'}
print(ids.get('ksl23'))
None
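
The calls above all omit the optional second argument. As a short sketch of the two-argument form (the 'unknown' default is an arbitrary placeholder chosen for illustration):

```python
ids = {'rb17': 'Rohit', 'jra1': 'Jeannie',
       'sfreund': 'Steve', 'lpd2': 'Lida'}

found = ids.get('lpd2', 'unknown')    # key exists: the default is ignored
missing = ids.get('ss32', 'unknown')  # key absent: the default is returned

print(found)
print(missing)
print('ss32' in ids)                  # get() did not insert the missing key
```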

Rewrite frequency using get

def frequencyOld(wordList):
    """Given a list of words, returns a dictionary of word frequencies"""
    freqDict = {} # initialize accumulator as empty dict
    for word in wordList:
        if word not in freqDict:
            freqDict[word] = 1 # add key with count 1
        else:
            freqDict[word] += 1 # update count
    return freqDict
def frequency(wordList):
    """Given a list of words, returns a dictionary of word frequencies"""
    freqDict = {} # initialize accumulator as empty dict
    for word in wordList:
        # look up the current count (0 if word is new), then add 1
        freqDict[word] = freqDict.get(word, 0) + 1
    return freqDict
frequency(['a', 'a', 'a', 'c', 'b', 'a', 'd'])
{'a': 4, 'c': 1, 'b': 1, 'd': 1}
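
Because get takes a key and returns its count, it can also serve as a key function. The sketch below uses a hypothetical helper mostFrequent (not part of the lecture code) to pick a word with the highest count from a frequency dictionary; it assumes the dictionary is non-empty, since max raises a ValueError on an empty input.

```python
def mostFrequent(freqDict):
    """Return a key with the largest count (hypothetical helper).

    Assumes freqDict is non-empty. On ties, max returns the first
    such key encountered during iteration.
    """
    return max(freqDict, key=freqDict.get)

verseCounts = {'let': 2, 'it': 2, 'be': 3, 'there': 1,
               'will': 1, 'an': 1, 'answer': 1}
print(mostFrequent(verseCounts))
```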

Dictionary Methods: keys(), values(), items()

Sometimes we are interested in knowing the keys, values, or items (key-value pairs) of a dictionary. The keys(), values(), and items() methods each return a view object containing only the keys, the values, or the (key, value) tuples, respectively.

calendar = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
            'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
            'Sep': 30, 'Oct': 31, 'Nov': 30, 'Dec': 31} 
calendar.keys()
dict_keys(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
calendar.values()
dict_values([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])
calendar.items()
dict_items([('Jan', 31), ('Feb', 28), ('Mar', 31), ('Apr', 30), ('May', 31), ('Jun', 30), ('Jul', 31), ('Aug', 31), ('Sep', 30), ('Oct', 31), ('Nov', 30), ('Dec', 31)])

To iterate over keys and values together, we can loop over items() and unpack each (key, value) tuple into two loop variables:

for mon,days in calendar.items():
    print(mon, days, end=" ")
Jan 31 Feb 28 Mar 31 Apr 30 May 31 Jun 30 Jul 31 Aug 31 Sep 30 Oct 31 Nov 30 Dec 31 
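
The objects returned by keys(), values(), and items() are views of the dictionary rather than independent lists, so they reflect later changes. A minimal sketch with a shortened, made-up calendar:

```python
calendar = {'Jan': 31, 'Feb': 28}

monthView = calendar.keys()    # view of the keys
dayView = calendar.values()    # view of the values

calendar['Mar'] = 31           # modify the dictionary after taking the views

print(list(monthView))         # the view now includes the new key
print(sum(dayView))            # views can be passed to functions like sum
```

Wrapping a view in list(), as the timing example later in these notes does, takes a snapshot that no longer tracks the dictionary.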

Lists as Dict Value and Aliasing

  • Dictionary keys must be immutable: strings, numbers, and tuples of immutable values work, but mutable types such as lists cannot be used as keys

  • Dictionary values can be of any type, including mutable types such as lists

This has aliasing implications.

hpDict = {'hp2': ['Harry Potter', 11, 'Gryffindor'],
          'ad1': ['Albus Dumbledore', 112, 'Gryffindor'],
          'ss3': ['Severus Snape', 60, 'Slytherin']}
newDict = hpDict # alias
newDict['cc10'] = ['Cho Chang', 13, 'Ravenclaw']
hpDict # changes
{'hp2': ['Harry Potter', 11, 'Gryffindor'],
 'ad1': ['Albus Dumbledore', 112, 'Gryffindor'],
 'ss3': ['Severus Snape', 60, 'Slytherin'],
 'cc10': ['Cho Chang', 13, 'Ravenclaw']}
student = hpDict['hp2']
student[1] = 14
hpDict
{'hp2': ['Harry Potter', 14, 'Gryffindor'],
 'ad1': ['Albus Dumbledore', 112, 'Gryffindor'],
 'ss3': ['Severus Snape', 60, 'Slytherin'],
 'cc10': ['Cho Chang', 13, 'Ravenclaw']}
newDict
{'hp2': ['Harry Potter', 14, 'Gryffindor'],
 'ad1': ['Albus Dumbledore', 112, 'Gryffindor'],
 'ss3': ['Severus Snape', 60, 'Slytherin'],
 'cc10': ['Cho Chang', 13, 'Ravenclaw']}
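
If aliasing is not what we want, one option is to copy the dictionary. The sketch below is an illustration rather than part of the lecture code: it contrasts the dictionary's copy() method, which makes a shallow copy that still shares the inner lists, with copy.deepcopy from the standard library, which copies the lists as well.

```python
import copy

hpDict = {'hp2': ['Harry Potter', 11, 'Gryffindor'],
          'ss3': ['Severus Snape', 60, 'Slytherin']}

shallow = hpDict.copy()        # new dictionary, same list objects as values
deep = copy.deepcopy(hpDict)   # new dictionary and new lists

shallow['cc10'] = ['Cho Chang', 13, 'Ravenclaw']  # affects only the copy
hpDict['hp2'][1] = 14          # mutates a list shared with the shallow copy

print('cc10' in hpDict)        # new key did not leak into the original
print(shallow['hp2'][1])       # shallow copy sees the mutated list
print(deep['hp2'][1])          # deep copy keeps the old value
```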

Dictionary Comprehensions

Similar to list comprehensions!

calendar = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
            'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
            'Sep': 30, 'Oct': 31, 'Nov': 30, 'Dec': 31} 
jMonths = {k: calendar[k] for k in calendar if k[0] == 'J'} # months starting with 'J'
jMonths
{'Jan': 31, 'Jun': 30, 'Jul': 31}
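
A comprehension can also loop over items() and filter on the values rather than the keys. As a sketch (the name thirtyDayMonths is made up for this example), the following keeps only the months with exactly 30 days:

```python
calendar = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
            'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
            'Sep': 30, 'Oct': 31, 'Nov': 30, 'Dec': 31}

# unpack each (key, value) tuple and keep the pair when the value is 30
thirtyDayMonths = {month: days
                   for month, days in calendar.items()
                   if days == 30}
print(thirtyDayMonths)
```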

Sorting Operations with Dictionaries

Let’s say we have a dictionary corresponding to key-value pairs of letters and their associated score in the board game Scrabble.

scrabbleScore = {'a':1 , 'b':3, 'c':3, 'd':2, 'e':1, 
                 'f':4, 'g':2, 'h':4, 'i':1, 'j':8, 
                 'k':5, 'l':1, 'm':3, 'n':1, 'o':1, 
                 'p':3, 'q':10, 'r':1, 's':1, 't':1, 
                 'u':1, 'v':8, 'w':4, 'x':8, 'y':4, 'z': 10} 

By default, calling the sorted function on a dictionary will return a sorted list of keys.

print(sorted(scrabbleScore))
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

But this sorting behavior isn’t very interesting for the Scrabble example. We would rather get an ordering based on the values (scores) of the letters. We can use ideas we’ve learned about key functions and tuples to help us!

def getScrabbleScore(letterScoreTuple):
    """
    Takes a tuple corresponding to (letter, score) and returns the score
    """
    return letterScoreTuple[1]


# first use the items method to get the (key, value) tuples
# and then sort using a key function
scrabbleItems = scrabbleScore.items()
sortedScrabbleItems = sorted(scrabbleItems, key=getScrabbleScore, reverse=True)
print(sortedScrabbleItems[0:3], '...', sortedScrabbleItems[-3:])
[('q', 10), ('z', 10), ('j', 8)] ... [('s', 1), ('t', 1), ('u', 1)]
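
For a key function this short, an inline lambda is a common alternative to a named helper. The sketch below uses a reduced, made-up subset of the scores and assumes lambda expressions are acceptable in this course; it relies on sorted being stable, so tied letters keep their original relative order even with reverse=True.

```python
scrabbleSubset = {'a': 1, 'q': 10, 'j': 8, 'z': 10, 'k': 5}

# pair[1] is the score in each (letter, score) tuple
byScore = sorted(scrabbleSubset.items(),
                 key=lambda pair: pair[1],
                 reverse=True)
print(byScore)
```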

We can further use a list comprehension to just get the letters from these tuples. Exercise: What would that look like?

Advantages of Storing Unordered Data as a Dictionary

So what’s the big deal about dictionaries? Let’s examine the benefit of using the dictionary for storing Scrabble scores as opposed to using a list of tuples or two separate lists.

import time
# letters to query many times (a fixed pattern repeated 1,000,000 times)
randomLetters = ['a', 'l', 'q', 's', 'y', 'z']*1000000
print("Number of queries", len(randomLetters))
Number of queries 6000000
# generate list of letters and scores
letters = list(scrabbleScore.keys())
scores = list(scrabbleScore.values())

# time using list operations to compute total score
startTime = time.time()
totalScore = 0

for query in randomLetters:
    index = letters.index(query)
    totalScore += scores[index]

endTime = time.time()
timeList = endTime - startTime
print("Time taken using a list", round(timeList, 3), "seconds")
Time taken using a list 2.54 seconds
# time using dictionaries to compute total score
startTime = time.time()
totalScore = 0

for query in randomLetters: 
    totalScore += scrabbleScore[query]

endTime = time.time()
timeDict = endTime - startTime
print("Time taken using a dictionary", round(timeDict, 3), "seconds")
Time taken using a dictionary 0.618 seconds

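Even in this simple example, dictionaries offer roughly a 4x speed-up! The list version has to scan letters with index() on every query, while the dictionary goes directly to the value for a given key.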
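
The experiment can be packaged into functions so that both approaches are checked for the same answer before their times are compared. This is a sketch under stated assumptions, not the lecture's code: the helper names are made up, the query list is shortened to 60,000 entries so it runs quickly, and time.perf_counter is swapped in for time.time because it is intended for measuring short durations. Actual timings will vary by machine, so none are shown.

```python
import time

scrabbleScore = {'a': 1, 'b': 3, 'c': 3, 'd': 2, 'e': 1,
                 'f': 4, 'g': 2, 'h': 4, 'i': 1, 'j': 8,
                 'k': 5, 'l': 1, 'm': 3, 'n': 1, 'o': 1,
                 'p': 3, 'q': 10, 'r': 1, 's': 1, 't': 1,
                 'u': 1, 'v': 8, 'w': 4, 'x': 8, 'y': 4, 'z': 10}
letters = list(scrabbleScore.keys())
scores = list(scrabbleScore.values())
queries = ['a', 'l', 'q', 's', 'y', 'z'] * 10000

def totalWithLists(queries):
    """Sum scores using two parallel lists (linear scan per query)."""
    total = 0
    for query in queries:
        total += scores[letters.index(query)]
    return total

def totalWithDict(queries):
    """Sum scores using direct dictionary lookups."""
    total = 0
    for query in queries:
        total += scrabbleScore[query]
    return total

def timed(func, queries):
    """Return (result, elapsed seconds) for one call of func(queries)."""
    start = time.perf_counter()
    result = func(queries)
    return result, time.perf_counter() - start

listTotal, listTime = timed(totalWithLists, queries)
dictTotal, dictTime = timed(totalWithDict, queries)
print("Same total:", listTotal == dictTotal)
print("List:", round(listTime, 3), "s  Dict:", round(dictTime, 3), "s")
```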