Python Style Guide

Written by Duane A. Bailey, and updated by Jeannie Albrecht, Lida Doret, Kelly Shaw, Shikha Singh, Stephen Freund, and Bill Jannen.

Implementations of algorithms (programs) can be beautiful. We make programs beautiful because they will be appreciated, later, by some reader. That may be you. If you’re a tool-builder, users will judge your code by what they read. Good programming languages make it easier to cast our abstract ideas as programs, but they also form an effective medium for human communication. Python is no exception.

Part of what makes an algorithm implementation beautiful is its ability to leverage the syntax of the programming language to convey meaning. When an algorithm is written well, the language’s presentation facilitates, rather than impedes, understanding. Part of the process of learning a language is to learn the common practices and coding idioms. Programming is a social discourse, and its idioms efficiently convey not only program meaning, but programmer intention.

This document is about coding practices—“pythonic” approaches—to writing code that facilitates understanding. These are cast as a series of rules, to focus the attention. Straying from these guidelines can have the effect of making your code harder for the reader to understand.

Still, Orwell might have proposed: break any rule should it lead to a design not worth defending.

Section 1: The First Rule

Rule 1.1: Use only Python version 3.

Python version 2 does not include features we depend on to write scalable and correct programs.

Section 2: Naming

This section discusses how we name things or objects in Python. Often, the name given to a value is our only hint at its meaning. Good names make good programs.

Rule 2.1: Limit names to alphabetic and numeric characters.

This restriction makes your programs more readable and maintainable. In fact, Python won’t allow you to use names involving some special characters.

Preferred:
pi
rstrip
reader
atan2
Avoid:
x-1
2-y
a&
b*

Rule 2.2: Use meaningful names. Single characters are allowed in for and with expressions.

Thoughtful naming conveys meaning.

Preferred:
answer
position
opponent
word_list
Avoid:
a
p
o
aFile

Rule 2.3: Use “snake case” to improve readability of multiword identifiers.

Separate successive words within a name with the underscore character (_). A consistent and compact approach makes multiword identifiers more readable and understandable.

Preferred:
mascot_image
purple_cow_motto
disk_read
Avoid:
inputFile
diskread

Rule 2.4: Only use underscores to indicate privacy or special variables.

Underscores can legally appear anywhere within names, but they can convey specific information. Underscores should be avoided at the beginning of generic variable names unless intentionally used to convey information about the variable’s usage. The lone underscore (_) is reserved to mean “this value is unimportant”.

Preferred:
__name__   # conveys specific meaning about variable __name__
p._x       # conveys the fact that x is a private member variable
_          # unimportant value
Avoid:
_life_as_usual
yourAnswerIs____

Rule 2.5: When naming constant variables, use capital letters.

Naming constant variables using only capital letters makes it easy to visually distinguish constants from variables whose values are subject to change. This visual cue decreases the likelihood of unintended modification of constant variables’ values.

Preferred:
PI = 3.14
MONTHS = 12
Avoid:
pi = 3.14
Months = 12

Section 3: Whitespace

Spaces, tabs, and end-of-line marks (collectively called whitespace) are used, as in writing, to delimit words and phrases. In Python, whitespace also plays an important role in structuring programs.

Rule 3.1: Indent consistently using only spaces.

Python uses consistent indentation to identify suites of related statements. Python treats tab characters as distinct from spaces, and some editors silently substitute a single tab for several spaces. Python 3 rejects indentation that mixes the two inconsistently, reporting a TabError.

Rule 3.2: Use blank lines to separate definitions. Use blank lines, sparingly, within definitions.

Blank lines help the reader identify the start of a new definition. Unused lines can also break longer suites of statements into logical parts. Multiple, adjacent blank lines, however, tend to make definitions appear scattered.

Rule 3.3: Use spaces as you would with human language punctuation.

Spaces and punctuation in programs serve a similar purpose to punctuation in human language. Use single spaces to separate conceptual units of the program, but do not use spaces where you would not find them in a human language. Some feel that using spaces around operators in expressions can make the “mathematics” clearer. Your choice to use spaces to emphasize operators—or not—should be consistent throughout your code.

Preferred:
def fibo(n, a=1, b=1):
    """Compute the n-th value in the Fibonacci sequence:
    fibo(0) = a, fibo(1) = b, fibo(n) = fibo(n-2)+fibo(n-1)."""
    for _ in range(n):
        a, b = b, a+b
    return a
Avoid:
def gcd(a,b) :
    """ Compute the greatest common integer divisor of a and b. """
    if a> b:
        result = gcd( b , a)
    elif a ==0 :
        result = b
    else :
        result=gcd(b%a,a)
    return result

Rule 3.4: Place each statement on a dedicated line. Do not use semicolons.

Python prefers one statement per line. While semicolons (;) can be used to separate statements, doing so tends to make it appear that the statements are executed concurrently when, in fact, they are executed sequentially. Avoid semicolons.

Preferred:
ratio = a//b
remainder = a%b
Avoid:
a = b; b = a   # *not* the same as a,b = b,a
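
To see why the semicolon version is not a swap, it may help to trace both forms on concrete values. The following is a minimal sketch; the starting values and the names sequential and swapped are chosen only for illustration.

```python
# Sequential assignment: the old value of a is lost before b reads it.
a, b = 1, 2
a = b
b = a
sequential = (a, b)      # both names now refer to 2

# Tuple assignment: the right-hand side is evaluated first, then unpacked.
a, b = 1, 2
a, b = b, a
swapped = (a, b)         # the two values are exchanged
```

The statements in the first group run strictly one after another, which is exactly what the semicolon form hides.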

Section 4: Code Density

Part of Python’s feeling of expressiveness is its succinctness. Still, succinctness can lead to a feeling of density; we need to be careful to make sure our Python statements are manageable, consumable chunks of logic.

Rule 4.1: Limit lines to 80 characters. Use parentheses to delimit longer expressions.

Because Python’s statements are line-oriented, it is important to make sure that readers are not confused by long lines wrapped by editors or printers. Either break up long lines into several shorter statements or manually break lines to make reading more straightforward.
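
As one illustration of both options, the sketch below breaks a long expression across lines inside parentheses, and then computes the same value with several shorter statements. The variable names and values are hypothetical and exist only to give the expression something to compute.

```python
# Hypothetical values, used only for illustration.
principal = 1000.0
annual_rate = 0.05
years = 10
compounds_per_year = 12

# Parentheses let the expression continue across lines without backslashes.
balance = (principal
           * (1 + annual_rate / compounds_per_year)
           ** (compounds_per_year * years))

# Alternatively, name the intermediate values and use shorter statements.
growth = 1 + annual_rate / compounds_per_year
periods = compounds_per_year * years
balance_in_steps = principal * growth ** periods
```

Which form reads better depends on whether the intermediate values deserve names of their own.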

Rule 4.2: Consistently use quotes ('hello') or quotations ("hello") around strings.

Python has two alternative ways to delimit strings: single apostrophes (or quotes) and quotation marks. Pick one style of delimiting strings, and use it consistently throughout your code.

Preferred:
print("So, " + name + ", do you want to play again?")
Avoid:
print("The lazy fox jumps over the quick " + dog_color + ' dog.')

Rule 4.3: Use parentheses only when necessary.

Parentheses in expressions (()) indicate the desired order of evaluation. Wrapping entire expressions in parentheses is typically unnecessary. In particular, if, while, return, and yield statements do not require parentheses around their expressions. Excessive use of parentheses can impact program readability.

Preferred:
if b*b >= 4*a*c:
    radical = (b*b - 4*a*c)**0.5
else:
    print("Roots are imaginary.")
Avoid:
def print_binary(n):
    if (n < 2):
        print((n%2))
    else:
        print_binary((n//2))
        print_binary((n%2))

Section 5: Documentation

These rules govern how we document programs. Documentation helps readers understand what code does and how it’s used. In Python, good documentation can support testing.

Rule 5.1: Every module, class, and function should contain an initial docstring.

When the first statement in a function, class, or module is an unused string, Python interprets that string as documentation, called a docstring. Python documentation is available through the pydoc3 command. Try, for example,

pydoc3 math

Inside the interpreter, documentation about an object is easily retrieved using the help function:

>>> help(print)

Programmers who build useful tools describe how they are used.

One important exception is the documentation of the short, one-off lambda functions. These functions are short enough to be easily understood. (If not, they should be declared using def.)

from random import random

def randint(low, high):
    """Pick a random integer between low and high (inclusive)."""
    return low + int(random() * (high-low+1))  # includes high

Documentation for definitions of identifiers that begin with underscores is not implicitly volunteered by pydoc3. Programmers should still provide docstrings for these definitions because (1) their documentation may still be explicitly requested and (2) making a definition public, later, should not expose undocumented features.

Rule 5.2: Use triple-quotations (""") to delimit docstrings.

Because docstrings typically span multiple lines, it is universal practice to use three quotation marks (""") to delimit this text, even if it is currently a single line.

Rule 5.3: Assume programs are read from top to bottom.

Comments beginning with the hash mark (#) describe code. A comment that shares a line with code describes that line; a comment on a dedicated line describes the code that follows. Where code is necessarily obscure or complex, provide a comment to facilitate the reader’s understanding.

Preferred:
# compute the real roots of the polynomial a*x**2+b*x+c:
radical = (b*b-4*a*c)**0.5     # using multiplication to compute b**2
root1 = (-b - radical)/(2*a)   # left root
root2 = (-b + radical)/(2*a)   # right root
Avoid:
# Is "year" a leap year?
mult4 = (year % 4) == 0
# year is divisible by 4
century = (year % 100) == 0
mult400 = (year % 400) == 0
isLeap = mult4 and ((not century) or mult400)
# ie.  a year is a leap year if it is divisible
# by 4, typically.  years divisible by 100
# are not leap years, UNLESS they are divisible by 400
# e.g. 2000 was a leap year, but 2100 won't be.

Rule 5.4: In docstrings, demonstrate example usage using doctest format.

Python supports a wide variety of testing strategies, including unit tests. The best docstrings include examples of function or method use.

def gcd(a, b):
    """Compute the greatest common divisor of integers a and b.

    >>> gcd(0, 0)
    0
    >>> gcd(0, 99)
    99
    >>> gcd(10, -15)
    5
    """

Python can optionally verify these examples with the doctest package.

  1. Always place a space after the prompt (“>>>”).
  2. When a result is a string, delimit the expected text with single quotes.
  3. Execute doctests as part of the script’s main suite of statements:
if __name__ == "__main__":
    import doctest
    doctest.testmod()
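
Putting these pieces together, a complete module might look like the following sketch. The body of gcd uses Euclid's algorithm, which is one possible implementation and not necessarily the authors'; the initials function is hypothetical and is included only to show a string result written with single quotes.

```python
def gcd(a, b):
    """Compute the greatest common divisor of integers a and b.

    >>> gcd(0, 0)
    0
    >>> gcd(0, 99)
    99
    >>> gcd(10, -15)
    5
    """
    # Euclid's algorithm (an assumed implementation, for illustration).
    while b:
        a, b = b, a % b
    return abs(a)


def initials(name):
    """Return the first letter of each word in name.

    >>> initials('purple cow')
    'pc'
    """
    return ''.join(word[0] for word in name.split())


if __name__ == "__main__":
    import doctest
    doctest.testmod()
```

Running the file as a script checks every example; doctest reports only the examples whose output differs from the expected text.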

Rule 5.5: Use correct spelling, punctuation, and grammar to improve comment readability.

Documentation is provided for humans to read. You can help them by writing informative documentation in language they can readily understand.

Preferred:
def randWord(self):
    """Pick a random word from the dictionary."""
    return random.choice(self.wordList)
Avoid:
def meanLength(wordList):
    """Compute da average....another silly method IMHO"""
    if wordList:
        result = sum([len(word) for word in wordList])/len(wordList)
    else:
        result = 0
    return result

Section 6: Design

This section—about meanings or semantics—gathers together miscellaneous guidelines about choice in design. Python supports many approaches to solving problems. Some decisions are problematic because (1) they lead to code that is hard to maintain, or (2) they lead to code that works but is unnecessarily difficult to understand.

Rule 6.1: Avoid the use of pass.

The pass statement (or the equivalent ...) is a do-nothing statement used where code is required. Its use suggests incomplete code or contorted program structure. We will use the pass statement to suggest areas where students should focus, but the expectation is that all pass statements will be replaced by working code.

An if statement that uses pass as a stand-in for one of its branches can always be simplified.
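
One way to carry out this simplification is to negate the condition and drop the empty branch. The sketch below uses two hypothetical functions, describe and describe_simplified, that behave identically.

```python
def describe(n):
    """Version that uses pass as a stand-in for the 'do nothing' branch."""
    label = 'non-negative'
    if n >= 0:
        pass
    else:
        label = 'negative'
    return label


def describe_simplified(n):
    """Same behavior: the condition is negated and the empty branch removed."""
    label = 'non-negative'
    if n < 0:
        label = 'negative'
    return label
```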

Rule 6.2: Manage files with the with statement, if possible.

Files should be closed when you are finished. When writing to a file, failing to close the file has the potential to lead to incomplete writes. Python’s with statement provides a mechanism for opening a file for access by subordinate statements. When the suite is finished, Python will ensure the file is automatically closed.

Preferred:
import csv

with open('census.csv') as pop_file:
    pop_rows = csv.reader(pop_file)
    towns = [row[0] for row in pop_rows]
Avoid:
dict = open('/usr/share/dict/words')
words = [word.strip() for word in dict]

Rule 6.3: Avoid unnecessary global variables.

While Python functions allow references to global variables, the practice should be avoided.

By contrast, top-level definitions of constants (e.g., math.pi) increase the readability and maintainability of code.

Preferred:
system_dictionary = '/usr/share/dict/words'
def read_dict(dict_name = system_dictionary):
    with open(dict_name) as dict_file:
        words = [word.strip() for word in dict_file]
    return words
Avoid:
base = 2                # ok: a constant
converted_number = ''   # not ok: will change
def binary(n):
    global converted_number
    if n == 0:
        converted_number = '0'
    else:
        while n:
            digit = '0' if n%base == 0 else '1'
            converted_number = digit + converted_number
            n //= base

Rule 6.4: Imports from separate modules should appear as separate imports.

While Python allows symbols to be imported from separate modules in a single import, the practice should be avoided. Import symbols from distinct modules using distinct import statements. This makes the dependencies of a program more understandable and improves maintainability of code.

Preferred:
from math import pi, e            # natural math constants
import matplotlib.pyplot as plt   # for plotting data in polar coordinates
Avoid:
import csv, random, doctest      # random utilities we might need

Rule 6.5: Return function values once, as the last statement.

Functions must always return values. It is always possible to include a single return statement at the end of the function, and best practice avoids the use of multiple returns when possible. Because every path through a function must produce a value, having a return only at the end guarantees that behavior. In general, multiple points of return may (1) make it more difficult to reason about function behavior and (2) make it difficult to maintain such behavior as the code matures.

Preferred:
def max_default(nums, default = 0.0):
    """Compute maximum of nums, or default if empty nums."""
    result = default
    if nums:
        result = max(nums)
    return result
Avoid:
def add(vals):
    """Compute sum of vals."""
    if vals:
        return vals[0] + add(vals[1:])
    return 0

Note that this is only a rule of thumb. In some cases (e.g. recursion), multiple points of return can improve readability:

def hex(n):
    """Convert a value to a hexadecimal string."""
    if n < 16:
        return "0123456789abcdef"[n]    # single digits
    else:
        return hex(n//16)+hex(n%16)     # many digits

Rule 6.6: Only use comprehensions when they are short and simple.

Python provides compact mechanisms for building containers, called comprehensions. These are idiomatic forms that replace simple, common for statements. When a comprehension approaches the length of a line, it ceases to be easily understood. In those cases, use the for statement.

Preferred:
# compute average word length in (nontrivial) story
with open('story.txt') as f:
    lines = [line for line in f]
# gather a list of words
words = []
for line in lines:
    words.extend(line.split())
mean = sum([len(word) for word in words])/len(words)
Avoid:
# extract palindromes from 'story.txt'
with open('story.txt') as f:
    pals = {word for line in f for word in line.split() if word == word[::-1]}

Similarly, generator expressions should be kept as simple as possible.
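
As a small illustration of a generator expression that stays simple, the sketch below totals word lengths without building an intermediate list. The word list is made-up sample data.

```python
words = ['the', 'purple', 'cow', 'grazes']   # illustrative data

# A generator expression feeds sum() directly; no intermediate list is built.
total_length = sum(len(word) for word in words)
mean_length = total_length / len(words)
```

Once the expression needs a second for clause or a long condition, a for statement is likely to be easier to read.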

Rule 6.7: Limit use of lambda functions to simple, use-once situations.

Lambda functions are nameless definitions that should be used to describe simple, single-use, single parameter functions. Typically, they are used to map values to sorting keys. Their anonymity and abbreviated syntax can obscure the meaning of important code.

Preferred:
# sort cities by decreasing population
cities.sort(key=lambda city: city.census, reverse=True)
Avoid:
# compute product of numbers
prod = lambda *args : 1 if not args else args[0] * prod(*args[1:])

Rule 6.8: Use only the simplest conditional expressions. Never nest them.

A conditional expression replaces a simple if statement whose branches select between related values of the same type. Use them only in the simplest cases, and never nest them.

Preferred:
print('Tell us {} {}!!'.format(n, 'story' if n == 1 else 'stories'))
Avoid:
# np is the number of people standing in line
print('I see '+('no one' if np == 0
                else 'a person' if np == 1
                else 'a couple' if np == 2
                else 'some people')+' waiting for tickets.')

Rule 6.9: Use only immutable values for default arguments.

Function arguments with default values can reduce redundancy in code and improve code reuse. It is important, though, to make sure the default value is immutable. A default value is created once, when the function is defined, and shared by every call; if it is mutable, a change made during one call is visible in later calls.

Preferred:
def include(key, value, table = None):  # None is immutable
    if table is None:
        table = dict()   # this default value can (and will) be changed
    table[key] = value
    return table
Avoid:
def include(key, value, table = dict()):  # one dict() is created at def
    table[key] = value
    return table
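
The difference between the two definitions shows up only on repeated calls. The sketch below gives the two versions distinct, hypothetical names (include_shared and include_fresh) so they can be compared side by side.

```python
def include_shared(key, value, table={}):     # one dict, created at def time
    table[key] = value
    return table


def include_fresh(key, value, table=None):    # None is immutable
    if table is None:
        table = {}                            # a new dict on every call
    table[key] = value
    return table


first = include_shared('a', 1)
second = include_shared('b', 2)    # same dict as first: now holds both keys

third = include_fresh('a', 1)
fourth = include_fresh('b', 2)     # independent dict: holds only 'b'
```

Here first and second are the same object, so the second call silently alters the result of the first, while third and fourth remain independent.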

Rule 6.10: Use implicit boolean interpretations of values.

“Zero-like” values (numeric versions of zero, and containers holding no values) are interpreted as False, and all others are interpreted as True. Use this fact to meaningfully simplify conditional expressions.

Preferred:
# l is a list to be destructively shuffled
shuffled = []
while l:
    selected = randint(0,len(l)-1)
    shuffled.append(l.pop(selected))
Avoid:
while l != []:
    first,*rest = l
    process(first)
    l = rest

(Though remember Rule 6.12: when checking for an object reference, use is None or is not None.)

Rule 6.11: Avoid the use of True and False in boolean computations.

Expressions involving explicit boolean constants (True and False) can always be improved. Simpler expressions are more readable.

Preferred:
while ball_went_foul:
    pitch_to_batter()
    ...
Avoid:
if ball_caught == False:  # more readable: if not ball_caught:
    wait_to_run()

Rule 6.12: When testing validity of object references, use is None.

In Python, the unique value None means no object. Variables that currently do not reference any object (sometimes called null references) have the value None. Functions that do not explicitly return a value return None. To test for the presence of a value, use is not None; similarly, is None checks for the lack of a value. While None is interpreted as the boolean False, so are other reasonable values, such as 0 or an empty list, so a truth test cannot tell them apart from None.

Preferred:
def combine(a_stream, b_stream, combiner = None):
    if combiner is None:                    # idiom for complex default arg
        combiner = lambda a,b : a+b
    while a_stream and b_stream:
        yield combiner(a_stream.pop(0), b_stream.pop(0))
Avoid:
def __init__(self, name, allergies):
    if not allergies:                       # wrong: use 'allergies is None'
        self.reactions = "<Unknown>"
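
The distinction matters whenever an empty container is itself a meaningful value. The following sketch uses a hypothetical function, describe_allergies, in which None means the information is missing and an empty list means there are known to be no allergies.

```python
def describe_allergies(allergies=None):
    """Report known allergies; None means the information is missing."""
    if allergies is None:          # no information was supplied
        result = '<Unknown>'
    elif not allergies:            # an empty list: known to have none
        result = 'none'
    else:
        result = ', '.join(allergies)
    return result
```

Testing with not allergies alone would report both cases as unknown, because None and the empty list are both interpreted as False.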

Rule 6.13: Distinguish script and module behaviors.

Python scripts often mature into small collections of tools. Avoid executing script-only code during import with the Python idiom:

# end of script definitions
if __name__ == "__main__":
    # beginning of script execution
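
A fuller sketch of the idiom follows. The function names mean_length and main are hypothetical; collecting the script-only code in a function called main is a common convention, not a requirement.

```python
def mean_length(words):
    """Compute the mean length of the strings in words (0 if empty).

    >>> mean_length(['a', 'bcd'])
    2.0
    """
    result = 0
    if words:
        result = sum(len(word) for word in words) / len(words)
    return result


def main():
    """Script behavior: runs only when the file is executed directly."""
    print(mean_length(['purple', 'cow']))


if __name__ == "__main__":
    main()
```

Importing this file from another module makes mean_length available without printing anything, because __name__ is then the module's name rather than "__main__".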