Nested Lists and Writing to Files

In the last lecture, we introduced file reading using the with ... as block. Today, we will focus on reading CSV files, storing the data as a list of lists, and analyzing it using list and string methods to compute important properties. We’ll also briefly look at writing output to files.

We have written a few helper functions for working with strings, which are now in a module sequenceTools:

  • isVowel(character): returns True if character is a vowel, and False otherwise

  • countVowels(word): returns number of vowels in word (int)

  • wordStartEnd(wordList): Takes a list of words and returns the list of words in it that start and end with the same letter

  • palindromes(wordList): Takes a list of words and returns the list of words in it that are palindromes

We will use them as we work through some examples.
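The source of sequenceTools is not shown here, so the following is only a minimal sketch of how such helpers might be written. It assumes that vowels are a, e, i, o, u in either case and that 'y' does not count; the real module may differ in these details.

```python
# Hypothetical stand-ins for the sequenceTools helpers described above.
# Assumption: vowels are a, e, i, o, u (either case); 'y' is not counted.

def isVowel(character):
    """Returns True if character is a single vowel, False otherwise."""
    return len(character) == 1 and character.lower() in 'aeiou'

def countVowels(word):
    """Returns the number of vowels in word (int)."""
    return sum(1 for ch in word if isVowel(ch))

def wordStartEnd(wordList):
    """Returns the words in wordList that start and end with the same letter."""
    return [w for w in wordList if len(w) > 0 and w[0].lower() == w[-1].lower()]

def palindromes(wordList):
    """Returns the words in wordList that read the same in both directions."""
    return [w for w in wordList if w.lower() == w.lower()[::-1]]

print(countVowels('Adelaide'))                  # 5
print(wordStartEnd(['area', 'cat', 'Anna']))    # ['area', 'Anna']
print(palindromes(['level', 'list', 'Anna']))   # ['level', 'Anna']
```

Comparing in lowercase is a design choice of this sketch: it lets 'Anna' count as a palindrome even though its first and last letters differ in case.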

from sequenceTools import *
# with a list comprehension!
filename = 'csv/classnames.csv' 
with open(filename) as roster:
    allStudents = [student.strip().split(',') for student in roster] 
allStudents # list of lists of strings
[['Aleman-Valencia', 'Karla', '25', 'ka14'],
 ['Batsaikhan', 'Munguldei', '25', 'mb34'],
 ['Berger', 'Marcello W.', '25', 'mwb3'],
 ['Bertolet', 'Jeremy S.', '24', 'jsb7'],
 ['Bhaskar', 'Monika A.', '25', 'mab13'],
 ['Blair', 'Maycie C.', '25', 'mcb12'],
 ['Brown', 'Courtney A.', '22', 'cab10'],
 ['Christ', 'Alexander M.', '22', 'amc11'],
 ['Gonzalez', 'Gabriela M.', '24', 'gmg7'],
 ['Herman', 'Adelaide A.', '25', 'aah6'],
 ['Hu', 'Jess', '25', 'jhh3'],
 ['Huang', 'Will', '24', 'wh4'],
 ['Jain', 'Divij', '25', 'dj4'],
 ['Kirtane', 'Jahnavi N.', '24', 'jnk1'],
 ['Kluev', 'Varya A.', '25', 'vak1'],
 ['Klugman', 'Pat T.', '25', 'ptk2'],
 ['Knight Garcia I', 'Grace P.', '24', 'gpk1'],
 ['Kolean', 'Owen A.', '25', 'oak2'],
 ['Lee', 'Chan', '24', 'cjl5'],
 ['Liang', 'Nathan S.', '25', 'nsl3'],
 ['Liu', 'Karen', '25', 'kl14'],
 ['Loftus', 'Andrew W.', '25', 'awl5'],
 ['Louchheim', 'Carter H.', '25', 'chl2'],
 ['Lurbur', 'Hadassah N.', '24', 'hnl2'],
 ['Magid', 'Sam P.', '25', 'sm39'],
 ['Miller', 'Jakin J.', '24', 'jjm5'],
 ['Moon', 'Chisang', '25', 'cm33'],
 ["O'Connor", 'Dan P.', '25', 'dpo2'],
 ['Olsen', 'Will T.', '25', 'wto2'],
 ['Paguada', 'Brandon', '23', 'bp9'],
 ['Park', 'Abraham S.', '22', 'asp8'],
 ['Polanco', 'Isabella G.', '25', 'igp1'],
 ['Poll', 'Noah D.', '24', 'ndp2'],
 ['Ratcliffe', 'Christopher K.', '24', 'ckr1'],
 ['Singh', 'Jaskaran', '25', 'js34'],
 ['Smith', 'Tyler C.', '24', 'tcs3'],
 ['Sturdevant', 'Zach S.', '25', 'zss1'],
 ['Tantum', 'Charlie J.', '25', 'cjt3'],
 ['Verkleeren', 'Sophia A.', '25', 'sav3'],
 ['Villanueva Astilleros', 'Paola C.', '25', 'pcv1'],
 ['Zhang', 'Alison Y.', '24', 'ayz2'],
 ['Anderson', 'Carter R.', '25', 'cra3'],
 ['Berger', 'Elissa J.', '25', 'ejb5'],
 ['Brissett', 'Keel M.', '25', 'kmb9'],
 ['Bruce', 'Giulianna', '25', 'gb12'],
 ['Bruns', 'Josh A.', '25', 'jab17'],
 ['Burger-Moore', 'Bailey C.', '23', 'bcb3'],
 ['Cantin', 'Claudia V.', '23', 'cvc2'],
 ['Cazabal', 'Victor M.', '25', 'vmc3'],
 ['Cecchi-Rivas', 'Fior D.', '25', 'fdc1'],
 ['Cook', 'Major C.', '24', 'mcc8'],
 ['Damra', 'Tala H.', '25', 'thd2'],
 ['Dawson', 'Quinn N.', '25', 'qnd1'],
 ['Dhingra', 'Ronak A.', '25', 'rad6'],
 ['Foisy', 'Sylvain J.', '24', 'sjf3'],
 ['Freund', 'Avery G.', '25', 'agf1'],
 ['Galizio', 'Riley S.', '24', 'rsg2'],
 ['Garcia', 'Kaiser A.', '23', 'kag6'],
 ['Gustafson', 'Annie H.', '24', 'ahg2'],
 ['Hall', 'Oliver E.', '23', 'oeh1'],
 ['Huang', 'Spencer B.', '24', 'sbh1'],
 ['Khishigsuren', 'Marla', '23', 'mk22'],
 ['Kovalski', 'Lola G.', '25', 'lgk1'],
 ['Laesch', 'Greta M.', '25', 'gml2'],
 ['Loyd', 'Eddie G.', '25', 'egl2'],
 ['Miotto', 'Joe', '24', 'jdm9'],
 ['Murray', 'James D.', '25', 'jdm10'],
 ['Nakato', 'Rika', '24', 'rn6'],
 ['Pandey', 'Himal R.', '25', 'hrp3'],
 ['Payel', 'Maruf', '25', 'mp19'],
 ['Pedlow', 'Harry J.', '25', 'hjp2'],
 ['Pujara', 'Shivam', '25', 'smp6'],
 ['Resch', 'Prairie C.', '25', 'pcr1'],
 ['Schumann', 'Jesse H.', '25', 'jhs2'],
 ['Vaccaro', 'William', '25', 'wav1'],
 ['Van Der Weide', 'Sebastian X.', '23', 'sxv1'],
 ['Wongibe', 'Bernard V.', '25', 'bvw1']]
size = len(allStudents) # number of students in class
size
77
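
To see what the comprehension does to each line of the file, it may help to apply strip and split to a single line by hand. The line below is made up for illustration and is not taken from the roster.

```python
# A made-up line in the same shape as the roster file (note the trailing newline).
line = 'Doe,Jane Q.,25,jqd1\n'

stripped = line.strip()        # removes the newline (and any surrounding spaces)
fields = stripped.split(',')   # cuts the string at each comma
print(fields)                  # ['Doe', 'Jane Q.', '25', 'jqd1']

# Every field is a string, including the class year.
print(fields[2] == 25)         # False
print(fields[2] == str(25))    # True
```

Splitting on commas this way works only when no field contains a comma itself. For files with quoted fields, Python's standard csv module is the safer choice.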

List Comprehension Exercises

Let’s complete the following tasks using list comprehensions.

  1. Generate a list of only student last names.

# generate list of only student last names
lastNames = [s[0] for s in allStudents]
lastNames
['Aleman-Valencia',
 'Batsaikhan',
 'Berger',
 'Bertolet',
 'Bhaskar',
 'Blair',
 'Brown',
 'Christ',
 'Gonzalez',
 'Herman',
 'Hu',
 'Huang',
 'Jain',
 'Kirtane',
 'Kluev',
 'Klugman',
 'Knight Garcia I',
 'Kolean',
 'Lee',
 'Liang',
 'Liu',
 'Loftus',
 'Louchheim',
 'Lurbur',
 'Magid',
 'Miller',
 'Moon',
 "O'Connor",
 'Olsen',
 'Paguada',
 'Park',
 'Polanco',
 'Poll',
 'Ratcliffe',
 'Singh',
 'Smith',
 'Sturdevant',
 'Tantum',
 'Verkleeren',
 'Villanueva Astilleros',
 'Zhang',
 'Anderson',
 'Berger',
 'Brissett',
 'Bruce',
 'Bruns',
 'Burger-Moore',
 'Cantin',
 'Cazabal',
 'Cecchi-Rivas',
 'Cook',
 'Damra',
 'Dawson',
 'Dhingra',
 'Foisy',
 'Freund',
 'Galizio',
 'Garcia',
 'Gustafson',
 'Hall',
 'Huang',
 'Khishigsuren',
 'Kovalski',
 'Laesch',
 'Loyd',
 'Miotto',
 'Murray',
 'Nakato',
 'Pandey',
 'Payel',
 'Pedlow',
 'Pujara',
 'Resch',
 'Schumann',
 'Vaccaro',
 'Van Der Weide',
 'Wongibe']
  2. Generate a list of only student first names.

# List comprehension to generate a list of first names 
# (without middle initial)
firstNames = [s[1].split()[0] for s in allStudents]
firstNames
['Karla',
 'Munguldei',
 'Marcello',
 'Jeremy',
 'Monika',
 'Maycie',
 'Courtney',
 'Alexander',
 'Gabriela',
 'Adelaide',
 'Jess',
 'Will',
 'Divij',
 'Jahnavi',
 'Varya',
 'Pat',
 'Grace',
 'Owen',
 'Chan',
 'Nathan',
 'Karen',
 'Andrew',
 'Carter',
 'Hadassah',
 'Sam',
 'Jakin',
 'Chisang',
 'Dan',
 'Will',
 'Brandon',
 'Abraham',
 'Isabella',
 'Noah',
 'Christopher',
 'Jaskaran',
 'Tyler',
 'Zach',
 'Charlie',
 'Sophia',
 'Paola',
 'Alison',
 'Carter',
 'Elissa',
 'Keel',
 'Giulianna',
 'Josh',
 'Bailey',
 'Claudia',
 'Victor',
 'Fior',
 'Major',
 'Tala',
 'Quinn',
 'Ronak',
 'Sylvain',
 'Avery',
 'Riley',
 'Kaiser',
 'Annie',
 'Oliver',
 'Spencer',
 'Marla',
 'Lola',
 'Greta',
 'Eddie',
 'Joe',
 'James',
 'Rika',
 'Himal',
 'Maruf',
 'Harry',
 'Shivam',
 'Prairie',
 'Jesse',
 'William',
 'Sebastian',
 'Bernard']

Student Fun Facts!

Let’s use our student list to get some practice with lists of lists and useful string and list methods.

  1. Write a function characterList which takes in two arguments rosterList (list of lists) and character (a string) and returns the list of students in the class whose first name starts with character.

def characterList(rosterList, character):
    """Takes the student info as a list of lists and a 
    string character and returns a list of students whose 
    first name starts with character"""
    return [name[1] for name in rosterList if name[1][0] == character]
characterList(allStudents, "B")
['Brandon', 'Bailey C.', 'Bernard V.']
  2. Write a function that can be used to compute the list of student(s) with the most vowels in their first name. (Hint: use countVowels().)

def mostVowels(wordList):
    '''Takes a list of strings wordList and returns a list
    of strings from wordList that contain the most vowels'''
    
    maxSoFar = 0 # initialize counter
    result = []
    for word in wordList:
        count = countVowels(word)
        if count > maxSoFar:
            # update: found a better word
            maxSoFar = count
            result = [word] 

        elif count == maxSoFar:  
            result.append(word)
    return result
# which student(s) has most vowels in their name?
mostVowelNames = mostVowels(firstNames)  
mostVowelNames
['Adelaide', 'Giulianna']
  3. We can use our helper function to find out which word(s) have the most vowels in other word lists, too.

songWords = []
with open('textfiles/mountains.txt') as song:  
    for line in song:
        songWords.extend(line.strip().split())

mostVowels(songWords)
['mountain',
 'peaceful',
 'mountains!',
 'mountains!',
 'rebounding',
 'fountains',
 'peaceful',
 'mountains',
 'mountain',
 'mountains!',
 'mountains!',
 'rebounding',
 'fountains']
  4. Write a function that can be used to compute the list of student(s) with the fewest vowels in their first name.

def leastVowels(wordList):
    '''Takes a list of strings wordList and returns a list
    of strings in wordList that contain the fewest vowels'''
    minSoFar = len(wordList[0]) # upper bound: a word has at most len(word) vowels
    result = []
    for word in wordList:
        count = countVowels(word)
        if count < minSoFar:
            # update: found a better word
            minSoFar = count
            result = [word] 

        elif count == minSoFar:  
            result.append(word)
    return result
leastVowelNames = leastVowels(firstNames)
leastVowelNames
['Jess',
 'Will',
 'Pat',
 'Chan',
 'Sam',
 'Dan',
 'Will',
 'Tyler',
 'Zach',
 'Josh',
 'Harry']
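
As an aside, the same results could also be computed in two passes: first find the extreme count with the built-in max or min, then keep every word that matches it. The sketch below is a hypothetical variant, not a replacement for the functions above. It uses its own stand-in vowelCount (assuming a, e, i, o, u in either case) so that it runs without sequenceTools.

```python
def vowelCount(word):
    # Stand-in for countVowels (assumed: a, e, i, o, u in either case).
    return sum(1 for ch in word if ch.lower() in 'aeiou')

def mostVowelsTwoPass(wordList):
    """Hypothetical variant of mostVowels: find the max count, then filter."""
    if not wordList:               # an empty list has no winner
        return []
    best = max(vowelCount(w) for w in wordList)
    return [w for w in wordList if vowelCount(w) == best]

def leastVowelsTwoPass(wordList):
    """Hypothetical variant of leastVowels: find the min count, then filter."""
    if not wordList:
        return []
    fewest = min(vowelCount(w) for w in wordList)
    return [w for w in wordList if vowelCount(w) == fewest]

names = ['Adelaide', 'Jess', 'Will', 'Giulianna', 'Karla']
print(mostVowelsTwoPass(names))    # ['Adelaide', 'Giulianna']
print(leastVowelsTwoPass(names))   # ['Jess', 'Will']
```

The two-pass version counts the vowels of each word twice, but it needs no running state. It also returns an empty list for an empty input, whereas leastVowels above would raise an IndexError on wordList[0].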
  5. Write a function yearList which takes in two arguments rosterList (list of lists) and year (int) and returns the list of students in the class with that graduating year.

def yearList(rosterList, year):
    """Takes the student info as a list of lists and a year (22-25)
    and returns a list of students graduating that year"""
    return [name[1]+" "+name[0] for name in rosterList if name[2] == str(year)]
juniors = yearList(allStudents, 23)
juniors
['Brandon Paguada',
 'Bailey C. Burger-Moore',
 'Claudia V. Cantin',
 'Kaiser A. Garcia',
 'Oliver E. Hall',
 'Marla Khishigsuren',
 'Sebastian X. Van Der Weide']

Writing to Files

We can write all the results that we are computing into a file (a persistent structure). To open a file for writing, we use open with the mode ‘w’.

The following code will create a new file named studentFacts.txt in the current working directory and write the results of our function calls to it.

fYears = len(yearList(allStudents, 25))
sophYears = len(yearList(allStudents, 24))
jYears = len(yearList(allStudents, 23))
sYears = len(yearList(allStudents, 22))
mostVowelNames = ', '.join(mostVowels(firstNames))
leastVowelNames = ', '.join(leastVowels(firstNames))

with open('studentFacts.txt', 'w') as sFile:
    sFile.write('Fun facts about CS134 students:\n')# need newlines
    sFile.write('Students with most vowels in their name: {}.\n'.format(mostVowelNames))
    sFile.write('Students with least vowels in their name: {}.\n'.format(leastVowelNames))
    sFile.write('No. of first years in CS134: {}.\n'.format(fYears))
    sFile.write('No. of sophmores in CS134: {}.\n'.format(sophYears))
    sFile.write('No. of juniors in CS134: {}\n'.format(jYears))
    sFile.write('No. of seniors in CS134: {}\n'.format(sYears))

We can use ls -l to see that a new file studentFacts.txt has been created:

ls -l
total 146632
drwxr-xr-x  3 freund  staff        96 Mar  7 09:29 __pycache__/
-rw-r--r--  1 freund  staff     20703 Feb 26 13:03 _extra.ipynb
-rw-r--r--  1 freund  staff     13451 Mar  5 14:35 _nested-lists-and-file-writing-starter.ipynb
-rw-r--r--  1 freund  staff     16433 Feb 26 13:03 _nested-lists-and-list-methods-Copy1.ipynb
drwxr-xr-x  6 freund  staff       192 Feb 26 13:03 csv/
-rw-r--r--  1 freund  staff     12799 Mar  5 14:35 nested-lists-and-file-writing.ipynb
-rw-r--r--  1 freund  staff  70990960 Mar  5 14:35 nested-lists-and-file-writing.key
-rw-r--r--  1 freund  staff   3996371 Mar  5 14:35 nested-lists-and-file-writing.pdf
-rw-r--r--  1 freund  staff      2083 Feb 26 13:03 sequenceTools.py
-rw-r--r--  1 freund  staff       319 Mar  7 09:29 studentFacts.txt
drwxr-xr-x  4 freund  staff       128 Feb 26 13:03 textfiles/

Use the OS command cat to view the contents of the file:

cat studentFacts.txt
Fun facts about CS134 students:
Students with most vowels in their name: Adelaide, Giulianna.
Students with least vowels in their name: Jess, Will, Pat, Chan, Sam, Dan, Will, Tyler, Zach, Josh, Harry.
No. of first years in CS134: 48.
No. of sophmores in CS134: 19.
No. of juniors in CS134: 7
No. of seniors in CS134: 3
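
The same write method can also save a list of lists back in CSV form, by joining each inner list with commas. The following is a minimal sketch under stated assumptions: the rows and the file name sample.csv are made up, and the file is written to a temporary directory so that nothing in the working directory is touched.

```python
import os
import tempfile

rows = [['Doe', 'Jane Q.', '25', 'jqd1'],     # made-up sample rows
        ['Roe', 'Sam', '24', 'sr2']]

with tempfile.TemporaryDirectory() as tmpDir:
    path = os.path.join(tmpDir, 'sample.csv')  # hypothetical file name

    with open(path, 'w') as outFile:
        for row in rows:
            # write() does not add a newline, so we add one per row
            outFile.write(','.join(row) + '\n')

    # read the file back the same way the roster was read
    with open(path) as inFile:
        roundTrip = [line.strip().split(',') for line in inFile]

print(roundTrip == rows)   # True
```

Note that join works only on lists of strings, so any numbers would need to be converted with str first.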

Appending to Files

If a file already has something in it, opening it in w mode again will erase all its past contents. If we need to add something to the end of a file instead, we open it in append mode, ‘a’.

For example, let us append a sentence to studentFacts.txt.

with open('studentFacts.txt', 'a') as sFile:
    sFile.write('Goodbye.\n')
cat studentFacts.txt 
Fun facts about CS134 students:
Students with most vowels in their name: Adelaide, Giulianna.
Students with least vowels in their name: Jess, Will, Pat, Chan, Sam, Dan, Will, Tyler, Zach, Josh, Harry.
No. of first years in CS134: 48.
No. of sophmores in CS134: 19.
No. of juniors in CS134: 7
No. of seniors in CS134: 3
Goodbye.
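
The difference between the two modes can be checked side by side. The sketch below writes to a throwaway file in a temporary directory (the name demo.txt is an arbitrary choice), opening it twice in ‘w’ mode and once in ‘a’ mode.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmpDir:
    path = os.path.join(tmpDir, 'demo.txt')   # hypothetical file name

    with open(path, 'w') as f:    # 'w' creates the file
        f.write('first\n')
    with open(path, 'w') as f:    # 'w' again: the old contents are erased
        f.write('second\n')
    with open(path, 'a') as f:    # 'a' keeps the old contents and adds at the end
        f.write('third\n')

    with open(path) as f:
        contents = f.read()

print(contents)
```

Only the lines second and third remain: the second ‘w’ open discarded first before anything new was written.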

More Lists of Lists: Ballots!

In Lab 4, you’ll work with lists of lists of strings in the form of ballots. An example is shown below.

# different types of coffee
filename = 'csv/coffee.csv' 
with open(filename) as coffeeTypes:
    allCoffee = []
    for coffee in coffeeTypes:
        allCoffee.append(coffee.strip().split(','))
allCoffee
[['kona', 'dickason', 'ambrosia', 'wonderbar', 'house'],
 ['kona', 'house', 'ambrosia', 'wonderbar', 'dickason'],
 ['kona', 'ambrosia', 'dickason', 'wonderbar', 'house'],
 ['kona', 'ambrosia', 'wonderbar', 'dickason', 'house'],
 ['house', 'kona', 'dickason', 'wonderbar', 'ambrosia'],
 ['kona', 'house', 'dickason', 'ambrosia', 'wonderbar'],
 ['kona', 'house', 'dickason', 'ambrosia', 'wonderbar'],
 ['dickason', 'ambrosia', 'wonderbar', 'kona', 'house'],
 ['house', 'kona', 'ambrosia', 'dickason', 'wonderbar'],
 ['ambrosia', 'house', 'wonderbar', 'kona', 'dickason'],
 ['wonderbar', 'ambrosia', 'kona', 'house', 'dickason'],
 ['house', 'wonderbar', 'kona', 'ambrosia', 'dickason']]
allCoffee[0] # access first "inner" list
['kona', 'dickason', 'ambrosia', 'wonderbar', 'house']
allCoffee[1] # access second inner list 
['kona', 'house', 'ambrosia', 'wonderbar', 'dickason']
allCoffee[0][1] # access second element in first inner list
'dickason'
# access second character of second element of first inner list 
allCoffee[0][1][1] 
'i'
# create list of only last elements of inner lists
lastCoffee = [coffee[-1] for coffee in allCoffee] 
lastCoffee
['house',
 'dickason',
 'house',
 'house',
 'ambrosia',
 'wonderbar',
 'wonderbar',
 'house',
 'wonderbar',
 'dickason',
 'dickason',
 'dickason']
# how many last place votes did wonderbar get?
lastCoffee.count("wonderbar")
3
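
The same count idea extends to every candidate at once. The sketch below uses the first five ballots from the output above, typed in directly so that it runs without the CSV file. The variable names and the [name, count] layout of the tally are illustrative choices, not part of the lab.

```python
# First five ballots from csv/coffee.csv, as shown in the output above.
ballots = [['kona', 'dickason', 'ambrosia', 'wonderbar', 'house'],
           ['kona', 'house', 'ambrosia', 'wonderbar', 'dickason'],
           ['kona', 'ambrosia', 'dickason', 'wonderbar', 'house'],
           ['kona', 'ambrosia', 'wonderbar', 'dickason', 'house'],
           ['house', 'kona', 'dickason', 'wonderbar', 'ambrosia']]

lastChoices = [ballot[-1] for ballot in ballots]

# Assumption: every ballot ranks the same candidates,
# so the first ballot lists them all.
candidates = sorted(ballots[0])

# A list of lists again: one [name, count] pair per candidate.
tally = [[name, lastChoices.count(name)] for name in candidates]
print(tally)
# [['ambrosia', 1], ['dickason', 1], ['house', 3], ['kona', 0], ['wonderbar', 0]]
```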