Lab 4: Every Vote Counts!

In this lab we will learn how to use Python to read in preferential ballot data from sample elections and determine the winner by implementing several different voting rules. In doing so, you will gain experience with the following:

  • Reading data from a csv file, parsing the data using string methods, and storing it in appropriate Python data structures

  • Using two dimensional arrays (or list of lists in Python)

  • Filtering lists through list comprehensions

  • Using convenient list methods: append(), count(), extend(), etc.

  • Iterating using while loops when the stopping condition is not predetermined.

Voting Rules Matter

When a group of people needs to make a collective decision or choose among alternatives, they typically do so by voting. The most common and natural voting rule is the plurality rule: count the votes received by each candidate, and the candidate with the most votes wins. While this seems like a reasonable rule, it can lead to undesirable results. For example, in the 2000 US presidential election, the race was very close and the outcome came down to the state of Florida, where the final vote counts (ignoring other candidates) were:

  • Bush: 2,912,790

  • Gore: 2,912,253

  • Nader: 97,488

There was only a ~500 vote difference between Bush and Gore (537 votes, by the counts above), and it was generally assumed that most people who voted for Nader preferred Gore as their second choice. Thus, Nader was considered a “spoiler” candidate, since his presence arguably flipped the result of the election. The plurality rule can also discourage truthful voting: voters have little incentive to vote for their top candidate if they believe that candidate is unlikely to win.
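The arithmetic behind the spoiler argument can be checked directly. The sketch below uses only the Florida counts quoted above; the variable names are illustrative, and the "net share" is simply the margin divided by Nader's total, not a claim about how those voters would actually have voted.

```python
# Florida 2000 counts as quoted in this handout (other candidates ignored).
florida_2000 = {"Bush": 2_912_790, "Gore": 2_912_253, "Nader": 97_488}

# Plurality rule: the candidate with the most votes wins.
plurality_winner = max(florida_2000, key=florida_2000.get)

# Margin between the top two candidates.
margin = florida_2000["Bush"] - florida_2000["Gore"]

# Net fraction of Nader's votes that Gore would have needed to gain
# over Bush in order to erase that margin.
net_share_needed = margin / florida_2000["Nader"]

print(plurality_winner, margin, f"{net_share_needed:.2%}")  # Bush 537 0.55%
```

In other words, a net swing of well under one percent of Nader's votes would have been enough to change the plurality winner.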

To alleviate these issues with plurality voting, many alternative voting rules have been studied and implemented. We will investigate some of these voting rules in this lab.

First, we consider the input: what preferences do we want to elicit from the voters? It is clear from the problems with plurality voting that eliciting just the first choice of candidates can be insufficient. Many voting rules thus ask voters to rank all candidates from their most to least preferred. We will look at such voting rules, which are used not only in government elections, but also for picking winners in sports competitions, Eurovision, the Oscars, etc.

Getting Started

Before you begin, clone this week’s repository in the usual manner.

  1. Open the Terminal and cd into your cs134 directory:

     :::bash
     cd cs134
    
  2. Clone your repository from https://evolene.cs.williams.edu:

     :::bash
     git clone https://evolene.cs.williams.edu/cs134-labs/22xyz3/lab04.git
    

    where 22xyz3 is a placeholder for your CS username.

  3. Navigate to your newly created lab04 subdirectory in the Terminal:

     :::bash
     cd lab04
    
  4. Explore the starter code for this lab: you will be implementing helpful methods for voting rules in voting.py and the different voting rules themselves in election.py.

Part 1. Complete the Voting Module

In the first part of this lab, we will implement some useful functions in the module voting.py, which we will eventually use to implement several different voting algorithms.

  1. Read the ballot data:

    Implement the function readBallot() in voting.py which takes a path to a csv file (comma-separated file) as an input string, e.g. 'data/simple.csv', and returns a list of lists of strings, where each “small” list is a single ballot containing candidate names (strings) ordered from most preferred to least preferred. For example, the csv file simple.csv in subdirectory data contains the following four voter preferences (one per line):

    :::bash    
    Aamir,Chris,Beth      
    Beth,Aamir,Chris        
    Chris,Beth,Aamir        
    Aamir,Beth,Chris
    

    Invoking readBallot('data/simple.csv') should return a list of four lists of strings, where each interior list represents the complete preference of a single voter.

    You may test this function in the following ways:

    :::python
    >>> readBallot('data/simple.csv')
    [['Aamir', 'Chris', 'Beth'], ['Beth', 'Aamir', 'Chris'], ['Chris', 'Beth', 'Aamir'], ['Aamir', 'Beth', 'Chris']]
    >>> readBallot('data/characters.csv')[5][3]
    'Scarlett O’Hara'
    >>> readBallot('data/example.csv')[3][1]
    'c'
    

    Please add a meaningful docstring to readBallot(), and include at least two new doctests to test the function.
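As a starting point, here is a minimal sketch of the parsing step, assuming each line holds one ballot with names separated by commas. The name parse_ballots is hypothetical (it is not part of the starter code), and it reads from an in-memory io.StringIO object standing in for an opened file so that the example is self-contained. Skipping blank lines and stripping whitespace around each name are assumptions of this sketch, not requirements of the lab.

```python
import io


def parse_ballots(file_obj):
    """Turn an iterable of comma-separated lines into a list of ballots."""
    ballots = []
    for line in file_obj:
        line = line.strip()          # drop the trailing newline and spaces
        if line:                     # assumption: ignore blank lines
            # split on commas and tidy each candidate name
            ballots.append([name.strip() for name in line.split(',')])
    return ballots


# io.StringIO mimics a file opened with open(path)
sample = io.StringIO("Aamir,Chris,Beth\nBeth,Aamir,Chris\n")
print(parse_ballots(sample))
# [['Aamir', 'Chris', 'Beth'], ['Beth', 'Aamir', 'Chris']]
```

With a real file, the same function could be called inside a with open(path) as f: block.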

  2. Collect the first choice votes:

    Implement the function firstChoice() in voting.py which takes a list of lists of strings (e.g., those returned by readBallot()) as input, and then creates and returns a new list of strings containing only the first choice of all voters. This is a good place to use a list comprehension.

    You may test this function in the following ways:

     :::python
     >>> firstChoice(readBallot('data/simple.csv'))
     ['Aamir', 'Beth', 'Chris', 'Aamir']
     >>> firstChoice([['a', 'b'], ['e'], ['f', 'g'], []])
     ['a', 'e', 'f']
    

    Please add a meaningful docstring to firstChoice(), and include at least two new doctests to test the function.
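To illustrate the list-comprehension idea, the sketch below keeps the first entry of every non-empty ballot. The function name top_picks is hypothetical; skipping empty ballots follows the behavior shown in the second doctest above.

```python
def top_picks(ballots):
    """Return the first entry of every non-empty ballot, in order."""
    # "if ballot" filters out empty lists, which have no first choice
    return [ballot[0] for ballot in ballots if ballot]


print(top_picks([['a', 'b'], ['e'], ['f', 'g'], []]))  # ['a', 'e', 'f']
```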

  3. Find unique candidates:

    Implement another function uniques() in voting.py which takes as input a list of strings candidateList (e.g., a list of first choice votes, such as the one returned by the firstChoice() function), and returns a list of strings of the unique candidates that appear in candidateList (in the order that they appear).

    You may test this function as follows:

     :::python
     >>> uniques(['Aamir', 'Beth', 'Chris', 'Aamir'])
     ['Aamir', 'Beth', 'Chris']
     >>> uniques(['a', 'd', 'e', 'a', 'd', 'f', 'c'])
     ['a', 'd', 'e', 'f', 'c']
    

    Please add a meaningful docstring to uniques(), and include at least two new doctests to test the function.
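One possible way to remove duplicates while preserving the order of first appearance is to build up a new list and append a name only if it is not already there. The name distinct_in_order below is hypothetical and the code is only a sketch of the idea.

```python
def distinct_in_order(names):
    """Return the distinct names, in order of first appearance."""
    seen = []
    for name in names:
        if name not in seen:     # membership test on the list built so far
            seen.append(name)
    return seen


print(distinct_in_order(['a', 'd', 'e', 'a', 'd', 'f', 'c']))
# ['a', 'd', 'e', 'f', 'c']
```

Converting to a set would also remove duplicates, but a set does not keep the original order, which this function is asked to preserve.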

  4. Find the candidates with the most and least first place votes:

    Next, complete the function mostVotes() in voting.py that takes as input a list of strings of names firstChoiceList (e.g., the first choice of all voters as returned by firstChoice()), and returns a list of the names that appear most often in firstChoiceList. Analogously, complete the function leastVotes() that takes the same kind of input and returns a list of the names that appear least often in firstChoiceList.

    You may test these functions in the following ways:

    :::python
    >>> mostVotes(['Aamir', 'Beth', 'Chris', 'Aamir'])
    ['Aamir']
    >>> mostVotes(['a', 'a', 'b', 'b', 'c', 'd', 'e', 'f', 'f'])
    ['a', 'b', 'f']
    >>> leastVotes(['Aamir', 'Beth', 'Chris', 'Aamir'])
    ['Beth', 'Chris']
    >>> leastVotes(['a', 'a', 'b', 'b', 'c', 'd', 'e', 'f', 'f'])
    ['c', 'd', 'e']
    

    (Hint: You may find the list method .count() useful here to count the number of times an element appears in a list. Also, you may find uniques() useful for eliminating duplicates.)

    Please add a meaningful docstring to mostVotes() and leastVotes(), and include at least two new doctests to test each function.
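The hint above can be sketched as follows: tally each distinct name with .count(), then keep the names whose tally equals the maximum (or minimum). The function extremes is hypothetical and returns both lists at once purely for illustration; returning two empty lists for empty input is an assumption of this sketch.

```python
def extremes(names):
    """Return (most frequent names, least frequent names) as two lists."""
    # distinct names in order of first appearance
    distinct = []
    for name in names:
        if name not in distinct:
            distinct.append(name)

    if not distinct:                 # assumption: empty input gives empty results
        return [], []

    counts = [names.count(name) for name in distinct]
    top, bottom = max(counts), min(counts)
    most = [name for name in distinct if names.count(name) == top]
    least = [name for name in distinct if names.count(name) == bottom]
    return most, least


print(extremes(['a', 'a', 'b', 'b', 'c', 'd', 'e', 'f', 'f']))
# (['a', 'b', 'f'], ['c', 'd', 'e'])
```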

  5. Determine the majority winner:

    Most voting rules agree on the principle that if one candidate receives a majority of first-place votes, that candidate should win the election. In real elections with several popular choices, however, there is often no majority candidate.

    Implement the majority() function in voting.py, that takes as input a list of strings of names firstChoiceList (e.g., the first-choice of all voters), checks if there is a single candidate who wins the majority (i.e., more than half) of the votes, and returns True if so. Otherwise, the function returns False. More precisely, a candidate is a majority winner if and only if they receive more than n//2 first place votes, where n is the total number of votes. You may not use a loop in this function, although you may use other functions in your voting module.

    You may test this function in the following ways:

    :::python
    >>> majority(['Aamir', 'Beth', 'Chris', 'Aamir'])
    False
    >>> majority(['a', 'a', 'a', 'b', 'c', 'd', 'a', 'a', 'f'])
    True
    

    Please add a meaningful docstring to majority(), and include at least two new doctests to test the function.
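To see how the n//2 threshold behaves, the sketch below computes the threshold and the largest tally for the two doctest inputs above. The helper majority_check is hypothetical, assumes a non-empty list, and only exposes the two numbers being compared.

```python
def majority_check(votes):
    """Return (threshold, top_count) for a non-empty list of first choices."""
    threshold = len(votes) // 2      # a majority needs strictly more than this
    top_count = max(votes.count(name) for name in set(votes))
    return threshold, top_count


for votes in (['Aamir', 'Beth', 'Chris', 'Aamir'],
              ['a', 'a', 'a', 'b', 'c', 'd', 'a', 'a', 'f']):
    threshold, top_count = majority_check(votes)
    print(len(votes), threshold, top_count, top_count > threshold)
# 4 2 2 False
# 9 4 5 True
```

With 4 votes, 2 first-place votes is exactly half and therefore not a majority; with 9 votes, 5 exceeds the threshold of 4.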

  6. Eliminate candidates:

    Finally, implement a function eliminateCandidates() in voting.py that takes as input two lists (in order):

    • a list of strings of candidates to be eliminated called eliminationList, and,

    • ballot data as a list of lists of strings called ballots (in the form returned by readBallot()).

    The function must eliminate any votes for candidates in eliminationList and return the updated ballots as a new list of lists of strings. For example, consider a voter’s preference list [Aamir, Chris, Beth]; if Chris is eliminated, the new preference list must be [Aamir, Beth] (that is, Beth moves up to second place to take the spot vacated by Chris). This is another good place to use a list comprehension.

    You may test this function in the following ways:

    :::python
    >>> eliminateCandidates(['Chris'], readBallot('data/simple.csv'))
    [['Aamir', 'Beth'], ['Beth', 'Aamir'], ['Beth', 'Aamir'], ['Aamir', 'Beth']]
    >>> eliminateCandidates(['Samwise Gamgee', 'Elizabeth Bennet'], readBallot('data/characters.csv')[0:3])
    [['Harry Potter', 'Scarlett O’Hara'], ['Harry Potter', 'Scarlett O’Hara'], ['Scarlett O’Hara', 'Harry Potter']]
    

    Please add a meaningful docstring to eliminateCandidates(), and include at least two new doctests to test the function.
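The filtering step can be sketched with a nested list comprehension: the outer comprehension walks over ballots, and the inner one keeps only the names that are not being dropped. The name drop_names is hypothetical. Because new lists are built, the original ballots are left unchanged.

```python
def drop_names(to_drop, ballots):
    """Return new ballots with every name in to_drop removed."""
    return [[name for name in ballot if name not in to_drop]
            for ballot in ballots]


original = [['Aamir', 'Chris', 'Beth'], ['Beth', 'Aamir', 'Chris'],
            ['Chris', 'Beth', 'Aamir'], ['Aamir', 'Beth', 'Chris']]
print(drop_names(['Chris'], original))
# [['Aamir', 'Beth'], ['Beth', 'Aamir'], ['Beth', 'Aamir'], ['Aamir', 'Beth']]
```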

  7. Test your module:

    Before moving on, thoroughly test the functions implemented in voting.py. Make sure to update the docstring at the top of the file, as well as the __all__ special variable. Also be sure you have added meaningful comments to your code.

When you are done with Part 1, remember to add, commit and push your work in voting.py, along with honorcode.txt, to evolene by Oct 6/7 @ 10 pm.

After finishing your voting module, you may move on to Part 2.

Part 2. Implement Voting Algorithms

Now that we have our helper functions implemented in the voting module, we are ready to implement several different voting algorithms in elections.py.

We will use a running example involving the ballots given in data/example.csv to explain each voting rule, and show that each of them gives a different winner.

A summary of the ballot data from data/example.csv, with three candidates (a, b, and c) and 21 voters, is shown below:

|# Ballots | Ranking  |
|----------|----------|
|     7    | a, c, b  |
|     7    | b, c, a  |
|     6    | c, b, a  |
|     1    | a, b, c  |

  1. Find the plurality winner:

    We start with the most common voting rule. Complete the function plurality() in elections.py which takes as input the ballot data ballots as a list of lists of strings (in the form returned by readBallot()), and returns a list of strings consisting of the name(s) of the candidate(s) who receive the most first-place votes. (Note: In the case of ties, there may be more than one winner!)

    In the example above (from data/example.csv), candidate a would win the plurality election with 8 first place votes.

    You must use the functions provided in the voting module. Do not use any loops.

    You may test this function in the following ways:

    :::python
    >>> plurality(readBallot('data/simple.csv'))
    ['Aamir']
    >>> plurality(readBallot('data/example.csv'))
    ['a']
    >>> plurality(readBallot('data/characters.csv'))
    ['Scarlett O’Hara', 'Samwise Gamgee']
    
  2. Find the Borda winner:

    Another well known voting rule that is often used in sports as well as in the Eurovision song contest is the Borda count. In this rule, each candidate gets a “total score” determined as follows. Suppose there are n candidates and each voter gives a strict ranking to all candidates. A candidate gets n points for each first-place vote, n-1 points for each second place vote, and so on, down to 1 point for each last place vote. The Borda score of each candidate is the total points they receive and the candidate(s) with the most points wins.

    In the example above (from data/example.csv), candidate a received 8 first place votes and 13 last place votes. b received 7 first place votes, 7 second place votes, and 7 third place votes. c received 6 first place votes, 14 second place votes, and 1 third place vote. Thus, the Borda rule would assign the following scores:

    • a : 8·3 + 13·1 = 37

    • b : 7·3 + 7·2 + 7·1 = 42

    • c : 6·3 + 14·2 + 1·1 = 47

    Thus, by the Borda rule, c would win the election (in contrast to a as the plurality winner.)
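The scores above can be double-checked with a short script. This is only a verification sketch: it rebuilds the 21 ballots from the summary table and tallies points in a dictionary, which is a choice made here for brevity and not the list-based representation suggested later in this section. The name borda_scores is hypothetical, and complete rankings are assumed.

```python
def borda_scores(ballots):
    """Return a dict mapping each candidate to its Borda score."""
    n = len(ballots[0])              # assumption: every ballot ranks all n candidates
    scores = {}
    for ballot in ballots:
        for position, name in enumerate(ballot):
            # first place earns n points, second n-1, ..., last place 1
            scores[name] = scores.get(name, 0) + (n - position)
    return scores


# the ballots summarized in the table for data/example.csv
example = ([['a', 'c', 'b']] * 7 + [['b', 'c', 'a']] * 7 +
           [['c', 'b', 'a']] * 6 + [['a', 'b', 'c']] * 1)
print(borda_scores(example))  # {'a': 37, 'c': 47, 'b': 42}
```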

    Complete the function borda() in elections.py which takes as input the ballot data ballots as a list of lists of strings (in the form returned by readBallot()), computes the Borda score of each candidate, and returns a list of strings of the winner(s): that is, the candidate(s) who receive the maximum Borda score. You may assume that each list in ballots includes all candidates (that is, there are no “undervotes”: every voter provides a complete ranking of all candidates). You may use loops this time, and the most elegant solutions will take advantage of functions in the voting module. When implementing this algorithm, it may be helpful to think about how you can represent a candidate’s Borda score in a list that contains only candidate names (as strings).

    You may test this function in the following ways:

    :::python
    >>> borda(readBallot('data/simple.csv'))
    ['Aamir']
    >>> borda(readBallot('data/example.csv'))
    ['c']
    >>> borda(readBallot('data/characters.csv'))
    ['Harry Potter']
    
  3. Find the ranked-choice voting winner:

    Recently, ranked-choice voting, or instant run-off, has gained popularity. Massachusetts had it on the ballot during the 2020 elections. New York conducted its much talked-about mayoral election using ranked-choice voting in summer 2021. The idea behind ranked-choice voting is to give voters more expressive power by taking into account how they rank each candidate (rather than just their top choice). The voting rule works as follows:

    Step 1. If there is a candidate who receives a majority (more than half) of the first-place votes, then this candidate wins and the election ends.

    Step 2. If there is no majority winner (no candidate receives more than half of the first-place votes), the candidate with the fewest first-place votes is eliminated from the election, with candidates ranked below moving up to take their spot. If more than one candidate receives the lowest number of votes, then all such candidates are eliminated.

    Step 3. Repeat Steps 1 and 2 until there is a majority winner in Step 1, or until the elimination round in Step 2 would result in eliminating all remaining candidates (which is considered a tied election). In the latter case, all remaining candidates are considered the winners and should be returned.

    Let us apply this process to the example in example.csv. As a reminder, here is the starting ballot data again (same as above, repeated for convenience):

    |# Ballots | Ranking  |
    |----------|----------|
    |     7    | a, c, b  |
    |     7    | b, c, a  |
    |     6    | c, b, a  |
    |     1    | a, b, c  |
    

    Starting with Step 1, we see that a has 8 first place votes, b has 7, and c has 6. Although a has the most first-place votes, 8 is not a majority of 21. Thus there is no majority winner, and we move on to Step 2 in our algorithm. Since c has the fewest first-place votes, it is eliminated first. After removing c, the updated ballot summary is:

    |# Ballots | Ranking |
    |----------|---------|
    |     8    |   a, b  |
    |     13   |   b, a  |
    

    Step 3 tells us that we must repeat the process, and return to Step 1. Now, b is a majority winner, since 13 > 21 // 2, and thus b wins the election. (Notice that all three voting rules give a different winner on this example ballot summary!)
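The three steps above can be sketched as a while loop whose stopping condition is not known in advance. The function instant_runoff below is a self-contained illustration with a hypothetical name; it does not use the voting module, and it relies on the simplifying assumption that every remaining candidate has at least one first-place vote. Dropping empty ballots and returning an empty list when no ballots remain are choices made for this sketch.

```python
def instant_runoff(ballots):
    """Return the list of winner(s) under the simplified ranked-choice rule."""
    ballots = [list(b) for b in ballots if b]      # assumption: ignore empty ballots
    while ballots:
        firsts = [b[0] for b in ballots]
        remaining = []                              # distinct first choices, in order
        for name in firsts:
            if name not in remaining:
                remaining.append(name)
        counts = [firsts.count(name) for name in remaining]

        # Step 1: a majority winner ends the election
        if max(counts) > len(firsts) // 2:
            return [remaining[counts.index(max(counts))]]

        # Step 2: find everyone tied for the fewest first-place votes
        losers = [name for name, c in zip(remaining, counts) if c == min(counts)]
        if len(losers) == len(remaining):           # eliminating all of them means a tie
            return remaining
        ballots = [[name for name in b if name not in losers] for b in ballots]
        ballots = [b for b in ballots if b]         # Step 3: repeat with updated ballots
    return []


example = ([['a', 'c', 'b']] * 7 + [['b', 'c', 'a']] * 7 +
           [['c', 'b', 'a']] * 6 + [['a', 'b', 'c']] * 1)
print(instant_runoff(example))  # ['b']
```

On the example ballots the loop runs twice: the first pass eliminates c, and the second pass finds b with 13 of 21 first-place votes.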

    Complete the function rankedChoice() in elections.py that implements the ranked-choice voting rule: given ballot data as a list of lists of strings, it should return a list of strings of candidate(s) that win the election based on ranked-choice voting. Use the helper functions implemented in voting.py.

    You may test this function in the following ways:

    :::python
    >>> rankedChoice(readBallot('data/simple.csv'))
    ['Aamir']
    >>> rankedChoice(readBallot('data/example.csv'))
    ['b']
    >>> rankedChoice(readBallot('data/characters.csv'))
    ['Scarlett O’Hara']
    

Update 10/12 (Begin)

Important note: The algorithm described above is a simplification of the actual ranked choice algorithm. In particular, the above algorithm will not work correctly for candidates who receive no first place votes. For our purposes, you may assume that each remaining candidate receives at least one first-place vote. The example data and test cases provided satisfy this assumption, and we will only test your code on such examples. (Note, however, you should still consider test cases such as empty ballots, e.g., [[], []].)

(Optional) Generalized Ranked-Choice Algorithm:
To generalize to all types of ballots (including those where a candidate may receive zero first-place votes), we need to do an extra step. Consider the following scenario:

|# Ballots | Ranking   |
|----------|-----------|
|     2    |   a, b, c |
|     2    |   b, c, a |

Notice that there is no majority candidate, so we need to move on to the elimination step. In this example, candidate c did not get any first-place votes. However, since the simplified algorithm described above looks to eliminate candidates with the fewest first-place votes, our function leastVotes would be called on firstChoice(ballots), and ['a', 'b'] would be eliminated. This is incorrect, though, since c received 0 first-place votes, and thus should be eliminated first. To handle such corner cases correctly, modify your rankedChoice function to first eliminate all candidates who receive zero first-place votes. (This is a good opportunity to define another helper function in our voting module. Do not modify the leastVotes function.) After such candidates are eliminated, we can proceed with the usual procedure as described above.

The updated ballots (after eliminating c) in our example are:

|# Ballots | Ranking |
|----------|---------|
|     2    |   a, b  |
|     2    |   b, a  |

At this point, a and b are tied. Calling leastVotes on firstChoice(ballots) returns ['a', 'b']. Eliminating both would result in eliminating all candidates, at which point, the function should return the winners as ['a', 'b'].
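The extra step for the generalized algorithm amounts to finding candidates who appear somewhere on a ballot but never in first place. A minimal sketch of such a helper follows; the name zero_first_place is hypothetical, and the lab leaves the design of this helper up to you.

```python
def zero_first_place(ballots):
    """Return candidates who appear on some ballot but are never ranked first."""
    firsts = [b[0] for b in ballots if b]
    everyone = []                        # all candidates, in order of first appearance
    for ballot in ballots:
        for name in ballot:
            if name not in everyone:
                everyone.append(name)
    return [name for name in everyone if name not in firsts]


scenario = [['a', 'b', 'c']] * 2 + [['b', 'c', 'a']] * 2
print(zero_first_place(scenario))  # ['c']
```

Once c is removed from the scenario, every remaining candidate has at least one first-place vote, and the simplified procedure applies unchanged.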

Update 10/12 (End)

  4. (Extra Credit) Find the Condorcet winner:

    A Condorcet winner is a candidate who wins a majority of the vote in a head-to-head election against every other candidate. In particular, we say candidate a beats b if a majority of voters prefer a to b. Candidate a is a Condorcet winner if a beats every other candidate. Note that a Condorcet winner may not always exist.[1]

    In our running example in example.csv, c beats both a and b in a head-to-head race (with a score of 13-8 in both cases), and thus c is the Condorcet winner.

    Write a function condorcet() in elections.py that takes ballot data as a list of lists of strings, and returns the name as a string of the Condorcet winner (if a Condorcet winner exists), else the function should return an empty string.
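The head-to-head comparison can be sketched as follows, assuming every ballot is a complete ranking so that a lower index means a stronger preference. Both beats and condorcet_sketch are hypothetical names, and this is just one possible approach rather than the expected solution.

```python
def beats(x, y, ballots):
    """Return True if a strict majority of ballots rank x above y."""
    wins = sum(1 for b in ballots if b.index(x) < b.index(y))
    return 2 * wins > len(ballots)


def condorcet_sketch(ballots):
    """Return the Condorcet winner's name, or '' if there is none."""
    if not ballots:
        return ''
    candidates = ballots[0]          # assumption: each ballot lists every candidate
    for x in candidates:
        if all(beats(x, y, ballots) for y in candidates if y != x):
            return x
    return ''


example = ([['a', 'c', 'b']] * 7 + [['b', 'c', 'a']] * 7 +
           [['c', 'b', 'a']] * 6 + [['a', 'b', 'c']] * 1)
print(condorcet_sketch(example))  # c

# a three-way cycle has no Condorcet winner
cycle = [['a', 'b', 'c'], ['b', 'c', 'a'], ['c', 'a', 'b']]
print(repr(condorcet_sketch(cycle)))  # ''
```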

  5. Running Elections:

    In Homework 1, you may recall that we asked you to vote for your favorite ice cream flavor. Now is the time to elect a winning flavor through each of the voting methods! Uncomment the print statements provided in the if __name__ == '__main__': block at the end of election.py and see which flavor wins for the different voting rules. Since there is a dominant first choice (we won’t reveal which one!) that all voting methods agree on, eliminate that flavor, and rerun the elections to see who wins using each voting rule.

Submitting Your Work and Grading Guidelines

  1. Part 1 of this lab (complete voting.py) is due on Oct 6/7 at 10pm, and the entire lab (completed voting.py and election.py) is due on Oct 13/14 at 10pm. We have created these checkpoints to balance the workload across the two weeks for you, the TAs, and the faculty.

  2. Use the helper functions defined in voting.py whenever possible. One of the main advantages of functions is enabling code reuse, and avoiding redundancy.

  3. This lab assignment provides several opportunities to use list comprehensions: to get some practice with them, you must write at least one list comprehension in either voting.py or election.py. We have noted a few obvious places to try incorporating a list comprehension in this handout.

  4. Do not modify function names or interpret parameters differently! Make sure your functions follow the expected behavior in terms of type of input and output: if they return lists, their default return type must always be list. A function’s documentation serves, in some way, as a contract between you and your users. Deviating from this contract makes it hard for potential users to adopt your implementation!

  5. Functionality and programming style are important, just as both the content and the writing style are important when writing an essay. Make sure your variables are named well, and that your use of comments, white space, and line breaks promotes readability. We expect to see code that makes your logic as clear and easy to follow as possible. A Python Style Guide is available on the course website to help you with stylistic decisions.

Good luck! Do not forget to add, commit, and push your work as it progresses! Test your code often to simplify debugging. Remember to certify your work by signing the honorcode.txt file (once when you turn in voting.py for Part 1, and again when you submit election.py for Part 2).


[1] It is possible that among three candidates, a defeats b, b defeats c, and c defeats a, so that no candidate beats every other. This is called the Condorcet paradox.