Nested Lists and Writing to Files
In the last lecture, we introduced file reading using the with ... as block. Today, we will focus on reading CSV files, storing the data as a list of lists, and analyzing it using list and string methods to compute properties of the data. We’ll also briefly look at writing output to files.
We have written a few helper functions for working with strings, which are now in a module sequenceTools:
isVowel(character): returns True if character is a vowel and False otherwise
countVowels(word): returns the number of vowels in word (int)
wordStartEnd(wordList): takes a list of words and returns the list of words in it that start and end with the same letter
palindromes(wordList): takes a list of words and returns the list of words in it that are palindromes
We will use them as we work through some examples.
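The module itself is not reproduced in these notes. As a rough guide, the helpers could look something like the sketch below. The bodies are an assumption based only on the descriptions above (treating a, e, i, o, u as vowels and ignoring case), not the actual contents of sequenceTools.

```python
# Hypothetical stand-ins for the sequenceTools helpers (assumed behavior).

def isVowel(character):
    """Return True if character is a single vowel (a, e, i, o, u), ignoring case."""
    return len(character) == 1 and character.lower() in 'aeiou'

def countVowels(word):
    """Return the number of vowels in word."""
    return len([ch for ch in word if isVowel(ch)])

def wordStartEnd(wordList):
    """Return the words that start and end with the same letter (case ignored)."""
    return [w for w in wordList if w != '' and w[0].lower() == w[-1].lower()]

def palindromes(wordList):
    """Return the words that read the same forwards and backwards (case ignored)."""
    return [w for w in wordList if w.lower() == w.lower()[::-1]]
```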
from sequenceTools import *
# with a list comprehension!
filename = 'csv/classnames.csv'
with open(filename) as roster:
    allStudents = [student.strip().split(',') for student in roster]
allStudents # list of lists of strings
[['Aleman-Valencia', 'Karla', '25', 'ka14'],
['Batsaikhan', 'Munguldei', '25', 'mb34'],
['Berger', 'Marcello W.', '25', 'mwb3'],
['Bertolet', 'Jeremy S.', '24', 'jsb7'],
['Bhaskar', 'Monika A.', '25', 'mab13'],
['Blair', 'Maycie C.', '25', 'mcb12'],
['Brown', 'Courtney A.', '22', 'cab10'],
['Christ', 'Alexander M.', '22', 'amc11'],
['Gonzalez', 'Gabriela M.', '24', 'gmg7'],
['Herman', 'Adelaide A.', '25', 'aah6'],
['Hu', 'Jess', '25', 'jhh3'],
['Huang', 'Will', '24', 'wh4'],
['Jain', 'Divij', '25', 'dj4'],
['Kirtane', 'Jahnavi N.', '24', 'jnk1'],
['Kluev', 'Varya A.', '25', 'vak1'],
['Klugman', 'Pat T.', '25', 'ptk2'],
['Knight Garcia I', 'Grace P.', '24', 'gpk1'],
['Kolean', 'Owen A.', '25', 'oak2'],
['Lee', 'Chan', '24', 'cjl5'],
['Liang', 'Nathan S.', '25', 'nsl3'],
['Liu', 'Karen', '25', 'kl14'],
['Loftus', 'Andrew W.', '25', 'awl5'],
['Louchheim', 'Carter H.', '25', 'chl2'],
['Lurbur', 'Hadassah N.', '24', 'hnl2'],
['Magid', 'Sam P.', '25', 'sm39'],
['Miller', 'Jakin J.', '24', 'jjm5'],
['Moon', 'Chisang', '25', 'cm33'],
["O'Connor", 'Dan P.', '25', 'dpo2'],
['Olsen', 'Will T.', '25', 'wto2'],
['Paguada', 'Brandon', '23', 'bp9'],
['Park', 'Abraham S.', '22', 'asp8'],
['Polanco', 'Isabella G.', '25', 'igp1'],
['Poll', 'Noah D.', '24', 'ndp2'],
['Ratcliffe', 'Christopher K.', '24', 'ckr1'],
['Singh', 'Jaskaran', '25', 'js34'],
['Smith', 'Tyler C.', '24', 'tcs3'],
['Sturdevant', 'Zach S.', '25', 'zss1'],
['Tantum', 'Charlie J.', '25', 'cjt3'],
['Verkleeren', 'Sophia A.', '25', 'sav3'],
['Villanueva Astilleros', 'Paola C.', '25', 'pcv1'],
['Zhang', 'Alison Y.', '24', 'ayz2'],
['Anderson', 'Carter R.', '25', 'cra3'],
['Berger', 'Elissa J.', '25', 'ejb5'],
['Brissett', 'Keel M.', '25', 'kmb9'],
['Bruce', 'Giulianna', '25', 'gb12'],
['Bruns', 'Josh A.', '25', 'jab17'],
['Burger-Moore', 'Bailey C.', '23', 'bcb3'],
['Cantin', 'Claudia V.', '23', 'cvc2'],
['Cazabal', 'Victor M.', '25', 'vmc3'],
['Cecchi-Rivas', 'Fior D.', '25', 'fdc1'],
['Cook', 'Major C.', '24', 'mcc8'],
['Damra', 'Tala H.', '25', 'thd2'],
['Dawson', 'Quinn N.', '25', 'qnd1'],
['Dhingra', 'Ronak A.', '25', 'rad6'],
['Foisy', 'Sylvain J.', '24', 'sjf3'],
['Freund', 'Avery G.', '25', 'agf1'],
['Galizio', 'Riley S.', '24', 'rsg2'],
['Garcia', 'Kaiser A.', '23', 'kag6'],
['Gustafson', 'Annie H.', '24', 'ahg2'],
['Hall', 'Oliver E.', '23', 'oeh1'],
['Huang', 'Spencer B.', '24', 'sbh1'],
['Khishigsuren', 'Marla', '23', 'mk22'],
['Kovalski', 'Lola G.', '25', 'lgk1'],
['Laesch', 'Greta M.', '25', 'gml2'],
['Loyd', 'Eddie G.', '25', 'egl2'],
['Miotto', 'Joe', '24', 'jdm9'],
['Murray', 'James D.', '25', 'jdm10'],
['Nakato', 'Rika', '24', 'rn6'],
['Pandey', 'Himal R.', '25', 'hrp3'],
['Payel', 'Maruf', '25', 'mp19'],
['Pedlow', 'Harry J.', '25', 'hjp2'],
['Pujara', 'Shivam', '25', 'smp6'],
['Resch', 'Prairie C.', '25', 'pcr1'],
['Schumann', 'Jesse H.', '25', 'jhs2'],
['Vaccaro', 'William', '25', 'wav1'],
['Van Der Weide', 'Sebastian X.', '23', 'sxv1'],
['Wongibe', 'Bernard V.', '25', 'bvw1']]
size = len(allStudents) # number of students in class
size
77
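To see what strip() and split(',') each contribute, the same steps can be run on a small in-memory file. This is only a sketch: the two rows are copied from the roster above, and io.StringIO stands in for the real classnames.csv.

```python
import io

# io.StringIO behaves like an open text file, so no file on disk is needed.
sampleRoster = io.StringIO("Lee,Chan,24,cjl5\nLiu,Karen,25,kl14\n")

rows = []
for line in sampleRoster:
    cleaned = line.strip()        # remove the trailing newline
    fields = cleaned.split(',')   # break the line at each comma
    rows.append(fields)

print(rows)
# [['Lee', 'Chan', '24', 'cjl5'], ['Liu', 'Karen', '25', 'kl14']]
print(type(rows[0][2]))   # the year is a string, not an int
```

Every field comes back as a string, including the graduation year.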
List Comprehension Exercises
Let’s complete the following tasks using list comprehensions.
Generate a list of only student last names.
# generate list of only student last names
lastNames = [s[0] for s in allStudents]
lastNames
['Aleman-Valencia',
'Batsaikhan',
'Berger',
'Bertolet',
'Bhaskar',
'Blair',
'Brown',
'Christ',
'Gonzalez',
'Herman',
'Hu',
'Huang',
'Jain',
'Kirtane',
'Kluev',
'Klugman',
'Knight Garcia I',
'Kolean',
'Lee',
'Liang',
'Liu',
'Loftus',
'Louchheim',
'Lurbur',
'Magid',
'Miller',
'Moon',
"O'Connor",
'Olsen',
'Paguada',
'Park',
'Polanco',
'Poll',
'Ratcliffe',
'Singh',
'Smith',
'Sturdevant',
'Tantum',
'Verkleeren',
'Villanueva Astilleros',
'Zhang',
'Anderson',
'Berger',
'Brissett',
'Bruce',
'Bruns',
'Burger-Moore',
'Cantin',
'Cazabal',
'Cecchi-Rivas',
'Cook',
'Damra',
'Dawson',
'Dhingra',
'Foisy',
'Freund',
'Galizio',
'Garcia',
'Gustafson',
'Hall',
'Huang',
'Khishigsuren',
'Kovalski',
'Laesch',
'Loyd',
'Miotto',
'Murray',
'Nakato',
'Pandey',
'Payel',
'Pedlow',
'Pujara',
'Resch',
'Schumann',
'Vaccaro',
'Van Der Weide',
'Wongibe']
Generate a list of only student first names.
# List comprehension to generate a list of first names
# (without middle initial)
firstNames = [s[1].split()[0] for s in allStudents]
firstNames
['Karla',
'Munguldei',
'Marcello',
'Jeremy',
'Monika',
'Maycie',
'Courtney',
'Alexander',
'Gabriela',
'Adelaide',
'Jess',
'Will',
'Divij',
'Jahnavi',
'Varya',
'Pat',
'Grace',
'Owen',
'Chan',
'Nathan',
'Karen',
'Andrew',
'Carter',
'Hadassah',
'Sam',
'Jakin',
'Chisang',
'Dan',
'Will',
'Brandon',
'Abraham',
'Isabella',
'Noah',
'Christopher',
'Jaskaran',
'Tyler',
'Zach',
'Charlie',
'Sophia',
'Paola',
'Alison',
'Carter',
'Elissa',
'Keel',
'Giulianna',
'Josh',
'Bailey',
'Claudia',
'Victor',
'Fior',
'Major',
'Tala',
'Quinn',
'Ronak',
'Sylvain',
'Avery',
'Riley',
'Kaiser',
'Annie',
'Oliver',
'Spencer',
'Marla',
'Lola',
'Greta',
'Eddie',
'Joe',
'James',
'Rika',
'Himal',
'Maruf',
'Harry',
'Shivam',
'Prairie',
'Jesse',
'William',
'Sebastian',
'Bernard']
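The expression s[1].split()[0] packs several steps into one. Unpacked on a single record from the roster, it works roughly like this:

```python
student = ['Berger', 'Marcello W.', '25', 'mwb3']

givenNames = student[1]        # 'Marcello W.'
parts = givenNames.split()     # split on whitespace: ['Marcello', 'W.']
firstName = parts[0]           # 'Marcello'

print(firstName)
```

A name with no middle initial, such as 'Jess', splits into a one-element list, so index 0 still picks out the first name.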
Student Fun Facts!
Let’s use our student list to get some practice with lists of lists and useful string and list methods.
Write a function characterList which takes in two arguments, rosterList (list of lists) and character (a string), and returns the list of students in the class whose first name starts with character.
def characterList(rosterList, character):
    """Takes the student info as a list of lists and a
    string character and returns a list of students whose
    first name starts with character"""
    return [name[1] for name in rosterList if name[1][0] == character]
characterList(allStudents, "B")
['Brandon', 'Bailey C.', 'Bernard V.']
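Note that the results keep the middle initial, because name[1] is the whole second field. A possible variant, sketched here under the hypothetical name firstNamesStartingWith, drops the initial and uses the string method startswith. The three records are copied from the roster above.

```python
def firstNamesStartingWith(rosterList, character):
    """Return first names (middle initial removed) that start with character.
    Hypothetical variant of characterList, for illustration only."""
    firsts = [record[1].split()[0] for record in rosterList]
    return [first for first in firsts if first.startswith(character)]

sample = [['Paguada', 'Brandon', '23', 'bp9'],
          ['Burger-Moore', 'Bailey C.', '23', 'bcb3'],
          ['Hall', 'Oliver E.', '23', 'oeh1']]

print(firstNamesStartingWith(sample, 'B'))   # ['Brandon', 'Bailey']
```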
Write a function that can be used to compute the list of student(s) with the most vowels in their first name. (Hint: use countVowels().)
def mostVowels(wordList):
    '''Takes a list of strings wordList and returns a list
    of strings from wordList that contain the most vowels'''
    maxSoFar = 0 # largest vowel count seen so far
    result = []
    for word in wordList:
        count = countVowels(word)
        if count > maxSoFar:
            # update: found a better word
            maxSoFar = count
            result = [word]
        elif count == maxSoFar:
            # tie with the current best
            result.append(word)
    return result
# which student(s) has most vowels in their name?
mostVowelNames = mostVowels(firstNames)
mostVowelNames
['Adelaide', 'Giulianna']
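The loop above finds the maximum and collects the matching words in a single pass. An alternative, sketched below, makes two passes: first compute the largest count with max, then filter. The local countVowels is an assumed stand-in for the sequenceTools version (vowels a, e, i, o, u, case ignored) so that the block runs on its own, and mostVowelsTwoPass is a hypothetical name.

```python
def countVowels(word):
    """Assumed stand-in for sequenceTools.countVowels."""
    return len([ch for ch in word.lower() if ch in 'aeiou'])

def mostVowelsTwoPass(wordList):
    """Return the words in wordList with the largest vowel count."""
    if wordList == []:
        return []                       # max() would fail on an empty list
    counts = [countVowels(word) for word in wordList]
    best = max(counts)
    return [word for word in wordList if countVowels(word) == best]

print(mostVowelsTwoPass(['Karla', 'Adelaide', 'Jess', 'Giulianna']))
# ['Adelaide', 'Giulianna']
```

The single-pass loop does less work, while the two-pass form separates "what is the best count" from "which words reach it".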
We can use our helper function to find out which word(s) have the most vowels in other word lists, too.
songWords = []
with open('textfiles/mountains.txt') as song:
    for line in song:
        songWords.extend(line.strip().split())
mostVowels(songWords)
['mountain',
'peaceful',
'mountains!',
'mountains!',
'rebounding',
'fountains',
'peaceful',
'mountains',
'mountain',
'mountains!',
'mountains!',
'rebounding',
'fountains']
Write a function that can be used to compute the list of student(s) with the fewest vowels in their first name.
def leastVowels(wordList):
    '''Takes a list of strings wordList and returns a list
    of strings in wordList that contain the fewest vowels'''
    # upper bound: a word has at most as many vowels as letters
    minSoFar = len(wordList[0])
    result = []
    for word in wordList:
        count = countVowels(word)
        if count < minSoFar:
            # update: found a better word
            minSoFar = count
            result = [word]
        elif count == minSoFar:
            # tie with the current best
            result.append(word)
    return result
leastVowelNames = leastVowels(firstNames)
leastVowelNames
['Jess',
'Will',
'Pat',
'Chan',
'Sam',
'Dan',
'Will',
'Tyler',
'Zach',
'Josh',
'Harry']
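Starting minSoFar at len(wordList[0]) works because a word cannot contain more vowels than letters, but it raises an IndexError on an empty list. One way around this is to start with None as a "nothing seen yet" marker. The sketch below is a variation, not the lecture's version: leastVowelsSafe is a hypothetical name, and countVowels is an assumed stand-in for the sequenceTools helper.

```python
def countVowels(word):
    """Assumed stand-in for sequenceTools.countVowels."""
    return len([ch for ch in word.lower() if ch in 'aeiou'])

def leastVowelsSafe(wordList):
    """Return the words in wordList with the smallest vowel count."""
    minSoFar = None                     # None means no word seen yet
    result = []
    for word in wordList:
        count = countVowels(word)
        if minSoFar is None or count < minSoFar:
            minSoFar = count            # new smallest count: restart the list
            result = [word]
        elif count == minSoFar:
            result.append(word)         # tie with the current smallest
    return result

print(leastVowelsSafe(['Karla', 'Jess', 'Will', 'Adelaide']))   # ['Jess', 'Will']
print(leastVowelsSafe([]))                                      # []
```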
Write a function yearList which takes in two arguments rosterList (list of lists) and year (int) and returns the list of students in the class with that graduating year.
def yearList(rosterList, year):
    """Takes the student info as a list of lists and a year (22-25)
    and returns a list of students graduating that year"""
    return [name[1]+" "+name[0] for name in rosterList if name[2] == str(year)]
juniors = yearList(allStudents, 23)
juniors
['Brandon Paguada',
'Bailey C. Burger-Moore',
'Claudia V. Cantin',
'Kaiser A. Garcia',
'Oliver E. Hall',
'Marla Khishigsuren',
'Sebastian X. Van Der Weide']
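If only the number of students per year is needed, the list method count offers a shortcut. In this sketch the four records are copied from the roster above. The years are compared as strings, since that is how they were read from the file.

```python
sample = [['Brown', 'Courtney A.', '22', 'cab10'],
          ['Paguada', 'Brandon', '23', 'bp9'],
          ['Hall', 'Oliver E.', '23', 'oeh1'],
          ['Lee', 'Chan', '24', 'cjl5']]

years = [record[2] for record in sample]   # ['22', '23', '23', '24']

print(years.count('23'))   # 2
print(years.count(23))     # 0, because the int 23 never equals the string '23'
```

This is the same reason yearList converts its argument with str(year) before comparing.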
Writing to Files
We can write all the results that we are computing into a file (a persistent structure). To open a file for writing, we use open with the mode 'w'.
The following code will create a new file named studentFacts.txt in the current working directory (or overwrite it if it already exists) and write the results of our function calls to it.
fYears = len(yearList(allStudents, 25))
sophYears = len(yearList(allStudents, 24))
jYears = len(yearList(allStudents, 23))
sYears = len(yearList(allStudents, 22))
mostVowelNames = ', '.join(mostVowels(firstNames))
leastVowelNames = ', '.join(leastVowels(firstNames))
with open('studentFacts.txt', 'w') as sFile:
    sFile.write('Fun facts about CS134 students:\n')  # need newlines
    sFile.write('Students with most vowels in their name: {}.\n'.format(mostVowelNames))
    sFile.write('Students with least vowels in their name: {}.\n'.format(leastVowelNames))
    sFile.write('No. of first years in CS134: {}.\n'.format(fYears))
    sFile.write('No. of sophmores in CS134: {}.\n'.format(sophYears))
    sFile.write('No. of juniors in CS134: {}\n'.format(jYears))
    sFile.write('No. of seniors in CS134: {}\n'.format(sYears))
We can use ls -l to see that a new file studentFacts.txt has been created:
ls -l
total 146632
drwxr-xr-x 3 freund staff 96 Mar 7 09:29 __pycache__/
-rw-r--r-- 1 freund staff 20703 Feb 26 13:03 _extra.ipynb
-rw-r--r-- 1 freund staff 13451 Mar 5 14:35 _nested-lists-and-file-writing-starter.ipynb
-rw-r--r-- 1 freund staff 16433 Feb 26 13:03 _nested-lists-and-list-methods-Copy1.ipynb
drwxr-xr-x 6 freund staff 192 Feb 26 13:03 csv/
-rw-r--r-- 1 freund staff 12799 Mar 5 14:35 nested-lists-and-file-writing.ipynb
-rw-r--r-- 1 freund staff 70990960 Mar 5 14:35 nested-lists-and-file-writing.key
-rw-r--r-- 1 freund staff 3996371 Mar 5 14:35 nested-lists-and-file-writing.pdf
-rw-r--r-- 1 freund staff 2083 Feb 26 13:03 sequenceTools.py
-rw-r--r-- 1 freund staff 319 Mar 7 09:29 studentFacts.txt
drwxr-xr-x 4 freund staff 128 Feb 26 13:03 textfiles/
Use the OS command cat to view the contents of the file:
cat studentFacts.txt
Fun facts about CS134 students:
Students with most vowels in their name: Adelaide, Giulianna.
Students with least vowels in their name: Jess, Will, Pat, Chan, Sam, Dan, Will, Tyler, Zach, Josh, Harry.
No. of first years in CS134: 48.
No. of sophmores in CS134: 19.
No. of juniors in CS134: 7
No. of seniors in CS134: 3
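Unlike print, write does not add a newline, and it accepts only strings. The sketch below writes two lines to a scratch file and reads them back. The temporary directory and the file name demo.txt are placeholders chosen for the example.

```python
import os
import tempfile

count = 7   # a number must be converted to a string before writing

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'demo.txt')
    with open(path, 'w') as outFile:
        outFile.write('No. of juniors: ')      # no newline is added here...
        outFile.write(str(count) + '\n')       # ...so this continues the same line
        outFile.write('Done.\n')
    with open(path) as inFile:
        contents = inFile.read()

print(contents)
```

The first two write calls end up on one line of the file because only the second supplies a '\n'.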
Appending to Files
If a file already has something in it, opening it in 'w' mode again will erase all its past contents. If we need to add something to the end of an existing file, we open it in append mode, 'a'.
For example, let us append a sentence to studentFacts.txt.
with open('studentFacts.txt', 'a') as sFile:
    sFile.write('Goodbye.\n')
cat studentFacts.txt
Fun facts about CS134 students:
Students with most vowels in their name: Adelaide, Giulianna.
Students with least vowels in their name: Jess, Will, Pat, Chan, Sam, Dan, Will, Tyler, Zach, Josh, Harry.
No. of first years in CS134: 48.
No. of sophmores in CS134: 19.
No. of juniors in CS134: 7
No. of seniors in CS134: 3
Goodbye.
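The difference between the two modes can be checked side by side. This sketch uses a scratch file (the temporary directory and the name demo.txt are placeholders) and reads the file back after each step.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'demo.txt')

    with open(path, 'w') as f:
        f.write('first\n')
    with open(path, 'a') as f:       # append: keeps 'first'
        f.write('second\n')
    with open(path) as f:
        afterAppend = f.read()

    with open(path, 'w') as f:       # write: erases everything so far
        f.write('third\n')
    with open(path) as f:
        afterRewrite = f.read()

print(repr(afterAppend))    # 'first\nsecond\n'
print(repr(afterRewrite))   # 'third\n'
```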
More Lists of Lists: Ballots!
In Lab 4, you’ll work with lists of lists of strings in the form of ballots. An example is shown below.
# different types of coffee
filename = 'csv/coffee.csv'
with open(filename) as coffeeTypes:
    allCoffee = []
    for coffee in coffeeTypes:
        allCoffee.append(coffee.strip().split(','))
allCoffee
[['kona', 'dickason', 'ambrosia', 'wonderbar', 'house'],
['kona', 'house', 'ambrosia', 'wonderbar', 'dickason'],
['kona', 'ambrosia', 'dickason', 'wonderbar', 'house'],
['kona', 'ambrosia', 'wonderbar', 'dickason', 'house'],
['house', 'kona', 'dickason', 'wonderbar', 'ambrosia'],
['kona', 'house', 'dickason', 'ambrosia', 'wonderbar'],
['kona', 'house', 'dickason', 'ambrosia', 'wonderbar'],
['dickason', 'ambrosia', 'wonderbar', 'kona', 'house'],
['house', 'kona', 'ambrosia', 'dickason', 'wonderbar'],
['ambrosia', 'house', 'wonderbar', 'kona', 'dickason'],
['wonderbar', 'ambrosia', 'kona', 'house', 'dickason'],
['house', 'wonderbar', 'kona', 'ambrosia', 'dickason']]
allCoffee[0] # access first "inner" list
['kona', 'dickason', 'ambrosia', 'wonderbar', 'house']
allCoffee[1] # access second inner list
['kona', 'house', 'ambrosia', 'wonderbar', 'dickason']
allCoffee[0][1] # access second element in first inner list
'dickason'
# access second character of second element of first inner list
allCoffee[0][1][1]
'i'
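Each pair of brackets peels off one level of nesting. Naming the intermediate values, as in this sketch with the first two ballots from above, can make a chain such as allCoffee[0][1][1] easier to read:

```python
ballots = [['kona', 'dickason', 'ambrosia', 'wonderbar', 'house'],
           ['kona', 'house', 'ambrosia', 'wonderbar', 'dickason']]

firstBallot = ballots[0]          # a list of strings
secondChoice = firstBallot[1]     # a string: 'dickason'
secondLetter = secondChoice[1]    # a one-character string: 'i'

# The chained form gives the same result in one expression.
print(secondLetter == ballots[0][1][1])   # True
```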
# create list of only last elements of inner lists
lastCoffee = [coffee[-1] for coffee in allCoffee]
lastCoffee
['house',
'dickason',
'house',
'house',
'ambrosia',
'wonderbar',
'wonderbar',
'house',
'wonderbar',
'dickason',
'dickason',
'dickason']
# how many last place votes did wonderbar get?
lastCoffee.count("wonderbar")
3
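The same idea extends to every coffee at once. The sketch below tallies last-place votes for each candidate as a list of [name, count] pairs, using the first four ballots from above. Taking the candidate names from the first ballot is an assumption that every ballot ranks the same coffees.

```python
ballots = [['kona', 'dickason', 'ambrosia', 'wonderbar', 'house'],
           ['kona', 'house', 'ambrosia', 'wonderbar', 'dickason'],
           ['kona', 'ambrosia', 'dickason', 'wonderbar', 'house'],
           ['kona', 'ambrosia', 'wonderbar', 'dickason', 'house']]

candidates = ballots[0]                          # assumed: same names on every ballot
lastPlaces = [ballot[-1] for ballot in ballots]  # ['house', 'dickason', 'house', 'house']

tally = [[name, lastPlaces.count(name)] for name in candidates]
print(tally)
# [['kona', 0], ['dickason', 1], ['ambrosia', 0], ['wonderbar', 0], ['house', 3]]
```

The result is itself a list of lists, so the indexing shown earlier applies: tally[4][1] is the number of last-place votes for 'house'.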