Dictionaries & Lists
Today, we will discuss dictionaries in more detail and explore the differences between unordered collections and sequences.
New Mutable Collection: Dictionaries
Dictionaries are unordered collections that map keys to values.
The main motivation behind dictionaries is support for efficient queries: to look up the value associated with a key, we do not need to search through all the keys. We simply use the key as the subscript, and the dictionary returns the corresponding value.
This makes queries a lot more efficient!
# sample dictionary
zipCodes = {'01267': 'Williamstown', '60606': 'Chicago',
'48202': 'Detroit', '97210': 'Portland'}
# what US city has this zip code?
zipCodes['60606']
'Chicago'
# what US city has this zip code?
zipCodes['48202']
'Detroit'
# if key does not exist
zipCodes['11777']
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/var/folders/md/kwd9nc_d2ns0hw9wsvdrnt2c0000gn/T/ipykernel_42692/1948765118.py in <module>
1 # if key does not exist
----> 2 zipCodes['11777']
KeyError: '11777'
zipCodes['11777'] = 'Port Jefferson'
zipCodes
{'01267': 'Williamstown',
'60606': 'Chicago',
'48202': 'Detroit',
'97210': 'Portland',
'11777': 'Port Jefferson'}
len(zipCodes)
5
'90210' in zipCodes
False
'01267' in zipCodes
True
zipCodes.values()
dict_values(['Williamstown', 'Chicago', 'Detroit', 'Portland', 'Port Jefferson'])
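Note that the in operator tests membership among the keys, not the values. The short sketch below illustrates the difference; the name zipSample is a hypothetical smaller dictionary introduced only for this illustration.

```python
# zipSample is a hypothetical, smaller version of the zip-code dictionary
zipSample = {'01267': 'Williamstown', '60606': 'Chicago'}

# `in` looks at the keys only
keyFound = '60606' in zipSample        # True
valueAsKey = 'Chicago' in zipSample    # False: 'Chicago' is a value, not a key

# to search the values, test against values() (this scans every value)
valueFound = 'Chicago' in zipSample.values()   # True

print(keyFound, valueAsKey, valueFound)
```

Searching the values this way gives up the fast lookup that keys enjoy, so it is best kept for small dictionaries or occasional checks.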
Creating Dictionaries
Dictionaries can be created in many ways:
Direct assignment
Starting with an empty dictionary and accumulating key-value pairs
Using the dict() function
# direct assignment
scrabbleScore = {'a':1 , 'b':3, 'c':3, 'd':2, 'e':1,
'f':4, 'g':2, 'h':4, 'i':1, 'j':8,
'k':5, 'l':1, 'm':3, 'n':1, 'o':1,
'p':3, 'q':10, 'r':1, 's':1, 't':1,
'u':1, 'v':8, 'w':4, 'x':8, 'y':4, 'z': 10}
# accumulate in a dictionary
verse = "let it be,let it be,let it be,let it be,there will be an answer,let it be"
counts = {} # empty dictionary
for line in verse.split(','):
    if line not in counts:
        counts[line] = 1   # initialize count
    else:
        counts[line] += 1  # update count
counts
{'let it be': 5, 'there will be an answer': 1}
# use dict() function
dict([('a', 5), ('b', 7), ('c', 10)])
{'a': 5, 'b': 7, 'c': 10}
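The dict() function accepts a few other input forms besides a list of tuples. The sketch below shows three that build the same dictionary; the variable names are ours and purely illustrative.

```python
letters = ['a', 'b', 'c']
numbers = [5, 7, 10]

fromPairs = dict([('a', 5), ('b', 7), ('c', 10)])  # list of (key, value) tuples
fromZip = dict(zip(letters, numbers))              # pair up two parallel lists
fromKeywords = dict(a=5, b=7, c=10)                # keyword arguments (string keys only)

# all three forms produce equal dictionaries
print(fromPairs == fromZip == fromKeywords)
```

The zip form is handy when keys and values already live in two separate lists of the same length.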
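Important Note: Dictionaries are not sequences: their items are accessed by key, not by position. Since Python 3.7, dictionaries are guaranteed to preserve insertion order, so keys and values are displayed and iterated over in the order in which they were inserted (in CPython 3.6 this was an implementation detail, and older versions make no such guarantee). The order still depends on the dictionary’s history of insertions and deletions: a key that is deleted and later re-inserted moves to the end.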
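To see how the history of insertions and deletions affects iteration order, here is a small sketch. It assumes Python 3.7 or later, and the dictionary named order is a hypothetical example.

```python
# assumes Python 3.7+, where insertion order is preserved
order = {}
order['first'] = 1
order['second'] = 2
order['third'] = 3
afterInserts = list(order)     # keys in insertion order

order['first'] = 100           # updating a value does not move the key
afterUpdate = list(order)

del order['first']             # deleting and re-inserting a key...
order['first'] = 1             # ...places it at the end
afterReinsert = list(order)

print(afterInserts)
print(afterUpdate)
print(afterReinsert)
```

For this reason, code should rely on looking items up by key and should not assume that a particular key sits at a particular position.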
Iterating over keys of a dictionary
We can iterate directly over the keys of a dictionary. This is the preferred way to access its items.
calendar = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
'Sep': 30, 'Oct': 31, 'Nov': 30, 'Dec': 31}
for month in calendar:
    print(month, calendar[month], end=",")
Jan 31,Feb 28,Mar 31,Apr 30,May 31,Jun 30,Jul 31,Aug 31,Sep 30,Oct 31,Nov 30,Dec 31,
Example: frequency
Let’s write a function frequency that takes as input a list of words wordList and returns a dictionary freqDict with the unique words in wordList as keys, and their number of occurrences in wordList as values.
def frequency(wordList):
    """Given a list of words, returns a dictionary of word frequencies"""
    freqDict = {}  # initialize accumulator as empty dict
    for word in wordList:
        # check whether word not in dictionary
        if word not in freqDict:
            freqDict[word] = 1
        else:
            freqDict[word] += 1
    return freqDict
frequency(['a', 'a', 'a', 'c', 'b', 'a', 'd'])
{'a': 4, 'c': 1, 'b': 1, 'd': 1}
verseWords = ['let','it','be','let','it','be','there','will','be','an','answer']
frequency(verseWords)
{'let': 2, 'it': 2, 'be': 3, 'there': 1, 'will': 1, 'an': 1, 'answer': 1}
Important Dictionary Method: .get()
The get() method is an alternative to subscripting for retrieving the value associated with a key, without first checking that the key exists.
It takes two arguments: a key, and an optional default value to use if the key is not in the dictionary.
It returns the value associated with the given key. If the key does not exist, it returns the default value if one was given, and None otherwise.
Syntax: val = myDict.get(aKey, defaultVal)
ids = {'rb17': 'Rohit', 'jra1': 'Jeannie',
'sfreund': 'Steve', 'lpd2': 'Lida'}
ids.get('lpd2')
'Lida'
print(ids.get('ss32'))
None
ids # .get does not change the dictionary
{'rb17': 'Rohit', 'jra1': 'Jeannie', 'sfreund': 'Steve', 'lpd2': 'Lida'}
print(ids.get('ksl23'))
None
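The examples above call get() with a key only. The sketch below adds the optional default argument and contrasts get() with plain subscripting; the ids dictionary is repeated so the snippet runs on its own, and the default string 'unknown' is an arbitrary choice for illustration.

```python
ids = {'rb17': 'Rohit', 'jra1': 'Jeannie',
       'sfreund': 'Steve', 'lpd2': 'Lida'}

found = ids.get('lpd2', 'unknown')     # key exists: the default is ignored
missing = ids.get('ss32', 'unknown')   # key absent: the default is returned

try:
    ids['ss32']                        # subscripting a missing key raises KeyError
    raised = False
except KeyError:
    raised = True

print(found, missing, raised)
```

Use subscripting when a missing key indicates a bug that should surface, and get() when a missing key is an expected case with a sensible fallback.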
Rewrite frequency using get
def frequencyOld(wordList):
    """Given a list of words, returns a dictionary of word frequencies"""
    freqDict = {}  # initialize accumulator as empty dict
    for word in wordList:
        if word not in freqDict:
            freqDict[word] = 1   # add key with count 1
        else:
            freqDict[word] += 1  # update count
    return freqDict
def frequency(wordList):
    """Given a list of words, returns a dictionary of word frequencies"""
    freqDict = {}  # initialize accumulator as empty dict
    for word in wordList:
        print(word)  # trace each word as it is processed
        # get() returns 0 for a word we have not seen yet,
        # so no membership check is needed
        freqDict[word] = freqDict.get(word, 0) + 1
    return freqDict
frequency(['a', 'a', 'a', 'c', 'b', 'a', 'd'])
a
a
a
c
b
a
d
{'a': 4, 'c': 1, 'b': 1, 'd': 1}
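As an aside, the standard library already provides this counting pattern through collections.Counter. This is a different tool from the hand-written frequency function above, shown only as a sketch for comparison; the names words and wordCounts are ours.

```python
from collections import Counter

words = ['a', 'a', 'a', 'c', 'b', 'a', 'd']
wordCounts = Counter(words)       # counts occurrences of each item

countOfA = wordCounts['a']        # 4
countOfZ = wordCounts['z']        # 0: a missing key yields 0, not a KeyError

print(dict(wordCounts))           # convert back to a plain dictionary
```

Writing frequency by hand is still the better exercise for understanding how accumulation in a dictionary works.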
Dictionary Methods: keys(), values(), items()
Sometimes we are interested in knowing the keys, values or items (key, value pairs) of a dictionary.
Each of these methods returns an object containing only the keys, values, and items, respectively.
calendar = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
'Sep': 30, 'Oct': 31, 'Nov': 30, 'Dec': 31}
calendar.keys()
dict_keys(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
calendar.values()
dict_values([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])
calendar.items()
dict_items([('Jan', 31), ('Feb', 28), ('Mar', 31), ('Apr', 30), ('May', 31), ('Jun', 30), ('Jul', 31), ('Aug', 31), ('Sep', 30), ('Oct', 31), ('Nov', 30), ('Dec', 31)])
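The dict_keys, dict_values and dict_items objects shown above are views: they reflect later changes to the dictionary, whereas converting a view with list() takes a snapshot. A minimal sketch, using a hypothetical two-month dictionary named daysIn:

```python
daysIn = {'Jan': 31, 'Feb': 28}

monthView = daysIn.keys()          # a live view of the keys
monthList = list(daysIn.keys())    # a snapshot copied into a list

daysIn['Mar'] = 31                 # change the dictionary afterwards

print(list(monthView))   # the view sees the new key
print(monthList)         # the snapshot does not
```

Converting to a list is also what makes indexing and slicing possible, since views do not support subscripts.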
To iterate over the keys and values, we can use the items() method and this for loop syntax to iterate over the tuples of key-value pairs.
for mon, days in calendar.items():
    print(mon, days, end=" ")
Jan 31 Feb 28 Mar 31 Apr 30 May 31 Jun 30 Jul 31 Aug 31 Sep 30 Oct 31 Nov 30 Dec 31
Lists as Dictionary Values and Aliasing
Dictionary keys must be immutable (more precisely, hashable): mutable types such as lists cannot be used as keys.
Dictionary values can be of any type, including mutable types such as lists.
This has aliasing implications.
hpDict = {'hp2': ['Harry Potter', 11, 'Gryffindor'],
'ad1': ['Albus Dumbledore', 112, 'Gryffindor'],
'ss3': ['Severus Snape', 60, 'Slytherin']}
newDict = hpDict # alias: both names refer to the same dictionary
newDict['cc10'] = ['Cho Chang', 13, 'Ravenclaw']
hpDict # changes too
{'hp2': ['Harry Potter', 11, 'Gryffindor'],
'ad1': ['Albus Dumbledore', 112, 'Gryffindor'],
'ss3': ['Severus Snape', 60, 'Slytherin'],
'cc10': ['Cho Chang', 13, 'Ravenclaw']}
student = hpDict['hp2'] # alias of the list stored as a value
student[1] = 14
hpDict
{'hp2': ['Harry Potter', 14, 'Gryffindor'],
'ad1': ['Albus Dumbledore', 112, 'Gryffindor'],
'ss3': ['Severus Snape', 60, 'Slytherin'],
'cc10': ['Cho Chang', 13, 'Ravenclaw']}
newDict
{'hp2': ['Harry Potter', 14, 'Gryffindor'],
'ad1': ['Albus Dumbledore', 112, 'Gryffindor'],
'ss3': ['Severus Snape', 60, 'Slytherin'],
'cc10': ['Cho Chang', 13, 'Ravenclaw']}
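When an independent dictionary is wanted instead of an alias, a copy is needed. The sketch below compares an alias, a shallow copy made with the copy() method, and a deep copy made with copy.deepcopy from the standard library; the dictionary roster is a hypothetical, shortened version of the one above.

```python
import copy

roster = {'hp2': ['Harry Potter', 11], 'ss3': ['Severus Snape', 60]}

alias = roster                  # same dictionary object
shallow = roster.copy()         # new dictionary, but the lists are shared
deep = copy.deepcopy(roster)    # new dictionary and new lists

roster['ad1'] = ['Albus Dumbledore', 112]   # add a key to the original
roster['hp2'][1] = 14                        # mutate a list stored as a value

# only the alias sees the new key
print('ad1' in alias, 'ad1' in shallow, 'ad1' in deep)
# the alias and the shallow copy both see the mutated list
print(alias['hp2'][1], shallow['hp2'][1], deep['hp2'][1])
```

A shallow copy protects against added or removed keys but not against changes to mutable values, which is exactly the case that arises when lists are stored as values.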
Dictionary Comprehensions
Similar to list comprehensions!
calendar = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
'Sep': 30, 'Oct': 31, 'Nov': 30, 'Dec': 31}
# keep only the months whose names start with 'J'
jMonths = {k: calendar[k] for k in calendar if k[0] == 'J'}
jMonths
{'Jan': 31, 'Jun': 30, 'Jul': 31}
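A comprehension can also filter on the values by iterating over items(). The following sketch selects the months that have exactly 30 days; the names thirtyDayMonths and wordLengths are ours, and calendar is repeated so the snippet runs on its own.

```python
calendar = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
            'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
            'Sep': 30, 'Oct': 31, 'Nov': 30, 'Dec': 31}

# unpack each (key, value) tuple and keep the pairs whose value is 30
thirtyDayMonths = {month: days for month, days in calendar.items() if days == 30}
print(thirtyDayMonths)

# a comprehension can also build a dictionary from any other iterable
wordLengths = {word: len(word) for word in ['let', 'it', 'be']}
print(wordLengths)
```

The general shape is {keyExpression: valueExpression for item in iterable if condition}, with the condition being optional.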
Sorting Operations with Dictionaries
Let’s say we have a dictionary corresponding to key-value pairs of letters and their associated score in the board game Scrabble.
scrabbleScore = {'a':1 , 'b':3, 'c':3, 'd':2, 'e':1,
'f':4, 'g':2, 'h':4, 'i':1, 'j':8,
'k':5, 'l':1, 'm':3, 'n':1, 'o':1,
'p':3, 'q':10, 'r':1, 's':1, 't':1,
'u':1, 'v':8, 'w':4, 'x':8, 'y':4, 'z': 10}
By default, calling the sorted function on a dictionary will return a sorted list of keys.
print(sorted(scrabbleScore))
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
But the above sorting behavior isn’t very interesting in this Scrabble example. Maybe we’d like an ordering based on the values (the scores of the letters) instead. We can use what we’ve learned about key functions and tuples to help us!
def getScrabbleScore(letterScoreTuple):
    """
    Takes a tuple corresponding to (letter, score) and returns the score
    """
    return letterScoreTuple[1]
# first use the items method to get a view of the (key, value) tuples
# and then sort them using a key function
scrabbleItems = scrabbleScore.items()
sortedScrabbleItems = sorted(scrabbleItems, key=getScrabbleScore, reverse=True)
print(sortedScrabbleItems[0:3], '...', sortedScrabbleItems[-3:])
[('q', 10), ('z', 10), ('j', 8)] ... [('s', 1), ('t', 1), ('u', 1)]
We can further use a list comprehension to just get the letters from these tuples. Exercise: What would that look like?
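One possible answer to the exercise, offered as a sketch and not the only solution: unpack each tuple in a list comprehension and keep the letter. For brevity this version uses a lambda as the key function in place of getScrabbleScore; the names sortedItems and sortedLetters are ours.

```python
scrabbleScore = {'a': 1, 'b': 3, 'c': 3, 'd': 2, 'e': 1,
                 'f': 4, 'g': 2, 'h': 4, 'i': 1, 'j': 8,
                 'k': 5, 'l': 1, 'm': 3, 'n': 1, 'o': 1,
                 'p': 3, 'q': 10, 'r': 1, 's': 1, 't': 1,
                 'u': 1, 'v': 8, 'w': 4, 'x': 8, 'y': 4, 'z': 10}

# sort the (letter, score) tuples by score, highest first
sortedItems = sorted(scrabbleScore.items(), key=lambda pair: pair[1], reverse=True)

# keep only the letter from each tuple
sortedLetters = [letter for letter, score in sortedItems]
print(sortedLetters[:3], '...', sortedLetters[-3:])
```

Because sorted is stable, letters with equal scores keep their original relative order, which is why 'q' appears before 'z'.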
Advantages of Storing Unordered Data as a Dictionary
So what’s the big deal about dictionaries? Let’s examine the benefit of using the dictionary for storing Scrabble scores as opposed to using a list of tuples or two separate lists.
# random letters to query several times
import time
randomLetters = ['a', 'l', 'q', 's', 'y', 'z']*1000000
print("Number of queries", len(randomLetters))
Number of queries 6000000
# generate list of letters and scores
letters = list(scrabbleScore.keys())
scores = list(scrabbleScore.values())
# time using list operations to compute total score
startTime = time.time()
totalScore = 0
for query in randomLetters:
    index = letters.index(query)
    totalScore += scores[index]
endTime = time.time()
timeList = endTime - startTime
print("Time taken using a list", round(timeList, 3), "seconds")
Time taken using a list 2.54 seconds
# time using dictionaries to compute total score
startTime = time.time()
totalScore = 0
for query in randomLetters:
    totalScore += scrabbleScore[query]
endTime = time.time()
timeDict = endTime - startTime
print("Time taken using a dictionary", round(timeDict, 3), "seconds")
Time taken using a dictionary 0.618 seconds
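Even in this simple example, dictionaries offer roughly a 4x speed-up! The gap grows with the size of the collection: letters.index(query) scans the list from the start on every query, while a dictionary lookup takes roughly the same time no matter how many keys it holds.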
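The same comparison can be made against the other alternative mentioned earlier, a list of tuples. The sketch below spells out the linear scan that such a lookup requires and checks that both approaches compute the same total; the helper lookupInPairs and the names sampleScores and scorePairs are hypothetical, and the query list is kept small so the snippet runs quickly.

```python
sampleScores = {'a': 1, 'l': 1, 'q': 10, 's': 1, 'y': 4, 'z': 10}
scorePairs = list(sampleScores.items())   # list of (letter, score) tuples

def lookupInPairs(pairs, letter):
    """Linear scan: examines tuples one by one until the letter is found."""
    for key, value in pairs:
        if key == letter:
            return value
    return None  # letter not present

queries = ['a', 'l', 'q', 's', 'y', 'z'] * 1000

# each list lookup may walk the whole list; each dict lookup goes straight to the key
totalFromPairs = sum(lookupInPairs(scorePairs, q) for q in queries)
totalFromDict = sum(sampleScores[q] for q in queries)

print(totalFromPairs, totalFromDict)
```

Both totals agree, so the choice between the two representations is about speed and convenience, not about the result.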