Written by Duane A. Bailey, and updated by Jeannie Albrecht, Lida Doret, Kelly Shaw, Shikha Singh, Stephen Freund, and Bill Jannen.
Implementations of algorithms—programs—can be beautiful. We make programs beautiful because they will be appreciated, later, by some reader. That may be you. If you’re a tool-builder, users will view your code by what they read. Good programming languages make it easier to cast our abstract ideas as programs, but they also form an effective medium for human communication. Python is no exception.
Part of what makes a beautiful algorithm implementation is just its ability to leverage the syntax of the programming language to convey meaning. When an algorithm is written well, the language’s presentation facilitates, rather than impedes, understanding. Part of the process of learning a language is to learn the common practices and coding idioms. Programming is a social discourse and its idioms efficiently convey not only program meaning, but programmer intention.
This document is about coding practices—“pythonic” approaches—to writing code that facilitates understanding. These are cast as a series of rules, to focus the attention. Straying from these guidelines can have the effect of making your code harder for the reader to understand.
Still, Orwell might have proposed: break any rule should it lead to a design not worth defending.
Python version 2 does not include features we depend on to write scalable and correct programs.
This section discusses how we name things or objects in Python. Often, the name given to a value is our only hint at its meaning. Good names make good programs.
This restriction makes your programs more readable and maintainable. In fact, Python won’t allow you to use names involving some special characters.
pi
rstrip
reader
atan2
-1
x2-y
&
a* b
for and with expressions.
Thoughtful naming conveys meaning.
answer
position
opponent
word_list
a
p
o
aFile
Separate successive words within a name with the underscore character (_). A consistent and compact approach makes multiword identifiers more readable and understandable.
mascot_image
purple_cow_motto
disk_read
inputFile
diskread
Underscores can legally appear anywhere within names, but they can convey specific information. Underscores should be avoided at the beginning of generic variable names unless intentionally used to convey information about the variable’s usage. The lone underscore (_) is reserved to mean “this value is unimportant”.
__name__  # conveys specific meaning about variable __name__
p._x      # conveys the fact that x is a private member variable
_         # unimportant value
_life_as_usual
yourAnswerIs____
Naming constants using only capital letters makes it easy to visually distinguish them from variables whose values are subject to change. This visual cue decreases the likelihood of unintended modification of a constant’s value.
PI = 3.14
MONTHS = 12
pi = 3.14
Months = 12
Spaces, tabs, and end-of-line marks (collectively called whitespace) are used, as in writing, to delimit words and phrases. In Python, whitespace also plays an important role in structuring programs.
Python uses consistent indentation to identify suites of related statements. Python treats tab characters as subtly different than spaces. Some editors may attempt to substitute several spaces with a single tab. This worries Python, which will warn of inconsistent indentation.
In Emacs, set indent-tabs-mode to nil to ensure correct behavior.
Blank lines help the reader identify the start of a new definition. Unused lines can also break longer suites of statements into logical parts. Multiple, adjacent blank lines, however, tend to make definitions appear scattered.
Spaces and punctuation in programs serve a similar purpose to punctuation in human language. Use single spaces to separate conceptual units of the program, but do not use spaces where you would not find them in a human language. Some feel that using spaces around operators in expressions can make the “mathematics” clearer. Your choice to use spaces to emphasize operators—or not—should be consistent throughout your code.
def fibo(n, a=1, b=1):
    """Compute the n-th value in the Fibonacci sequence:
    fibo(0) = a, fibo(1) = b, fibo(n) = fibo(n-2)+fibo(n-1)."""
    for _ in range(n):
        a, b = b, a+b
    return a
def gcd(a,b) :
    """ Compute the greatest common integer divisor of a and b. """
    if a> b:
        result = gcd( b , a)
    elif a ==0 :
        result = b
    else :
        result =gcd(b%a,a)
    return result
Python prefers one statement per line. While semicolons (;) can be used to separate statements, they tend to make it appear that statements are to be executed concurrently when, in fact, they are executed sequentially. Avoid semicolons.
ratio = a//b
remainder = a%b
a = b; b = a # *not* the same as a,b = b,a
Part of Python’s feeling of expressiveness is its succinctness. Still, succinctness can lead to a feeling of density; we need to be careful to make sure our Python statements are manageable, consumable chunks of logic.
Because Python’s statements are line-oriented, it is important to make sure that readers are not confused by long lines wrapped by editors or printers. Either break up long lines into several shorter statements or manually break lines to make reading more straightforward.
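As a minimal sketch of both options, the example below breaks one long string inside parentheses and splits a computation into several short statements. The function `describe` and the `rainfall` data are hypothetical names used only for illustration.

```python
def describe(name, low, high, average):
    """Build a one-line summary of a (hypothetical) measurement series."""
    # Adjacent string literals inside parentheses are joined by Python,
    # so the long message can be broken by hand without backslashes.
    message = (
        "{}: low={:.1f}, high={:.1f}, "
        "average={:.1f}"
    ).format(name, low, high, average)
    return message

rainfall = [3.1, 2.7, 4.0, 3.6]   # made-up data
# Several short statements instead of one long one.
low = min(rainfall)
high = max(rainfall)
average = sum(rainfall) / len(rainfall)
summary = describe("rainfall", low, high, average)
```

Either approach keeps each line short enough that an editor or printer never has to wrap it.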
Consistently use apostrophes (’hello’) or quotations ("hello") around strings.
Python has two alternative ways to delimit strings: single apostrophes (or quotes) and quotation marks. Pick one style of delimiting strings, and use it consistently throughout your code.
print("So, " + name + ", do you want to play again?")
print("The lazy fox jumps over the quick " + dog_color + ' dog.')
Parentheses in expressions (()) indicate the desired order of evaluation. Wrapping entire expressions in parentheses is typically unnecessary. In particular, if, while, return, and yield statements do not require parentheses around their expressions. Excessive use of parentheses can impact program readability.
if b*b >= 4*a*c:
    radical = (b*b - 4*a*c)**0.5
else:
    print("Roots are imaginary.")
def print_binary(n):
    if (n < 2):
        print((n%2))
    else:
        print_binary((n//2))
        print_binary((n%2))
These rules govern how we document programs. Documentation helps readers understand what code does and how it’s used. In Python, good documentation can support testing.
When the first statement in a function, class, or module is an unused string, Python interprets that string as documentation, called a docstring. Python documentation is available through the pydoc3 command. Try, for example,
pydoc3 math
Inside the interpreter, documentation about an object is easily retrieved using the help function:
>>> help(print)
Programmers who build useful tools describe how they are used.
One important exception is the documentation of the short, one-off lambda functions. These functions are short enough to be easily understood. (If not, they should be declared using def.)
from random import random # assumed: random() yields a float in [0, 1)
def randint(low, high):
    """Pick a random integer between low and high (inclusive)."""
    return low + int(random() * (high-low+1)) # includes high
Documentation for definitions of identifiers that begin with underscores is not implicitly volunteered by pydoc3. Programmers should still provide docstrings for these definitions because (1) their documentation may still be explicitly requested and (2) making a definition public, later, should not expose undocumented features.
Use three quotation marks (""") to delimit docstrings.
Because docstrings typically span multiple lines, it is universal practice to use three quotation marks (""") to delimit this text, even if it is currently a single line.
Comments beginning with the hash mark (#) describe code on a particular line, or, if standalone, on lines that follow. Where code is necessarily obscure or complex, it is helpful to provide a comment to facilitate the reader’s understanding. Comments that appear on a line with code describe the line. Comments that appear on dedicated lines describe the code that follows.
# compute the real roots of the polynomial a*x**2+b*x+c:
radical = (b*b-4*a*c)**0.5 # using multiplication to compute b**2
root1 = (-b - radical)/(2*a) # left root
root2 = (-b + radical)/(2*a) # right root
# Is "year" a leap year?
mult4 = (year % 4) == 0
# year is divisible by 4
century = (year % 100) == 0
mult400 = (year % 400) == 0
isLeap = mult4 and ((not century) or mult400)
# ie. a year is a leap year if it is divisible
# by 4, typically. years divisible by 100
# are not leap years, UNLESS they are divisible by 400
# e.g. 2000 was a leap year, but 2100 won't be.
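For comparison, here is one possible tighter version of the leap-year computation above, with a single comment placed where the logic is least obvious. The function name `is_leap` is an assumption made for this sketch.

```python
def is_leap(year):
    """Return True if year is a leap year in the Gregorian calendar.

    >>> is_leap(2000)
    True
    >>> is_leap(2100)
    False
    """
    # Century years are leap years only when divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
```

The docstring carries the explanation and the examples, so the code needs only one short comment.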
Python supports a wide variety of testing strategies, including unit tests. The best docstrings include examples of function or method use.
def gcd(a, b):
    """Compute the greatest common divisor of integers a and b.
    >>> gcd(0, 0)
    0
    >>> gcd(0, 99)
    99
    >>> gcd(10, -15)
    5
    """
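Python can optionally verify these examples with the doctest package, which runs each line that begins with “>>>” and compares the result with the expected output that follows.
if __name__ == "__main__":
    import doctest
    doctest.testmod()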
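The sketch below shows what that verification looks like for a single function. It assumes a Euclid-style body for gcd (the guide shows only the docstring) and runs the examples through doctest's parser and runner directly rather than through testmod(), so that the outcome can be inspected as a value.

```python
import doctest

def gcd(a, b):
    """Compute the greatest common divisor of integers a and b.

    >>> gcd(0, 99)
    99
    >>> gcd(10, -15)
    5
    """
    # Assumed implementation: Euclid's algorithm; abs() keeps the
    # result non-negative when an argument is negative.
    while b:
        a, b = b, a % b
    return abs(a)

# Extract the examples from the docstring and run them.
parser = doctest.DocTestParser()
test = parser.get_doctest(gcd.__doc__, {"gcd": gcd}, "gcd", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)   # a (failed, attempted) pair
```

When every example passes, `results.failed` is zero and doctest prints nothing, which is the same quiet behavior testmod() has on success.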
Documentation is provided for humans to read. You can help them by writing informative documentation in language they can readily understand.
def randWord(self):
    """Pick a random word from the dictionary."""
    return random.choice(self.wordList)
def meanLength(wordList):
    """Compute da average....another silly method IMHO"""
    if wordList:
        result = sum([len(word) for word in wordList])/len(wordList)
    else:
        result = 0
    return result
This section—about meanings or semantics—gathers together miscellaneous guidelines about choice in design. Python supports many approaches to solving problems. Some decisions are problematic because (1) they lead to code that is hard to maintain, or (2) they lead to code that works but is unnecessarily difficult to understand.
Avoid pass.
The word pass (or the equivalent ...) is a do-nothing statement used where code is required. Its use suggests incomplete code or contorted program structure. We will use the pass statement to suggest areas where students should focus, but the expectation is that all pass statements will be replaced by working code.
The use of pass as a stand-in for a branch of an if statement can always be simplified.
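One way to carry out that simplification is to negate the condition and drop the empty branch. The helper `report_bad` below is a hypothetical function, used only to show the before and after.

```python
def report_bad(value):
    """Hypothetical helper: describe a negative value, else return ''."""
    message = ''
    # Before: the 'if' branch did nothing.
    #     if value >= 0:
    #         pass
    #     else:
    #         message = 'negative'
    # After: negate the condition and drop the empty branch.
    if value < 0:
        message = 'negative'
    return message
```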
Use the with statement, if possible.
Files should be closed when you are finished with them. When writing to a file, failing to close the file has the potential to lead to incomplete writes. Python’s with statement provides a mechanism for opening a file for access by subordinate statements. When the suite is finished, Python will ensure the file is automatically closed.
with open('census.csv') as pop_file:
    pop_rows = csv.reader(pop_file)
    towns = [row[0] for row in pop_rows]
dict = open('/usr/share/dict/words')
words = [word.strip() for word in dict]
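The closing behavior can be observed directly. This sketch writes and then reads a small file inside a temporary directory; the file name and the town names are made up for the example.

```python
import os
import tempfile

# A temporary directory keeps this sketch from touching real files.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'towns.txt')   # hypothetical file name

    # The file is flushed and closed when this suite ends.
    with open(path, 'w') as out_file:
        out_file.write('Williamstown\nAdams\n')

    with open(path) as in_file:
        towns = [line.strip() for line in in_file]

# After the suite, the file object still exists but is closed.
file_was_closed = in_file.closed
```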
While Python functions allow references to global variables, the practice should be avoided.
By contrast, top-level definitions of constants (e.g. math.pi) increase readability and maintainability of code.
system_dictionary = '/usr/share/dict/words'
def read_dict(dict_name = system_dictionary):
    with open(dict_name) as dict_file:
        words = [word.strip() for word in dict_file]
    return words
base = 2 # ok: a constant
converted_number = '' # not ok: will change
def binary(n):
    global converted_number
    if n == 0:
        converted_number = '0'
    else:
        while n:
            digit = '0' if n%base == 0 else '1'
            converted_number = digit + converted_number
            n //= base
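A possible rewrite of the conversion above keeps the constant at the top level but makes the changing value local and returns it. This is a sketch of one alternative, not the only reasonable design.

```python
BASE = 2   # a constant, so a top-level definition is fine

def binary(n):
    """Convert a non-negative integer n to a string of binary digits."""
    converted_number = ''           # local: rebuilt on every call
    if n == 0:
        converted_number = '0'
    while n:
        digit = '0' if n % BASE == 0 else '1'
        converted_number = digit + converted_number
        n //= BASE
    return converted_number
```

Because no state survives between calls, calling the function twice with the same argument gives the same answer, which the global version does not guarantee.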
While Python allows symbols to be imported from separate modules in a single import, the practice should be avoided. Import symbols from distinct modules using distinct import statements. This makes the dependencies of a program more understandable and improves maintainability of code.
from math import pi, e # natural math constants
import matplotlib.pyplot as plt # for plotting data in polar coordinates
import csv, random, doctest # random utilities we might need
Functions must always return values. It is always possible to include a single return statement at the end of the function. Best practice avoids the use of multiple returns when possible. Because every path through a function must produce a value, having a return only at the end guarantees that behavior. In general, multiple points of return may (1) make it more difficult to reason about function behavior and (2) make it difficult to maintain such behavior as the code matures.
def max_default(nums, default = 0.0):
    """Compute maximum of nums, or default if empty nums."""
    result = default
    if nums:
        result = max(nums)
    return result
def add(vals):
    """Compute sum of vals."""
    if vals:
        return vals[0] + add(vals[1:])
    return 0
Note that this is only a rule of thumb. In some cases (e.g. recursion), multiple points of return can improve readability:
def hex(n):
    """Convert a value to a hexadecimal string."""
    if n < 16:
        return "0123456789abcdef"[n] # single digits
    else:
        return hex(n//16)+hex(n%16) # many digits
Python provides compact mechanisms for building containers, called comprehensions. These are idiomatic forms that replace simple, common for statements. When a comprehension approaches the length of a line, it ceases to be easily understood. In those cases, use the for statement.
# compute average word length in (nontrivial) story
with open('story.txt') as f:
    lines = [line for line in f]
# gather a list of words
words = []
for line in lines:
    words.extend(line.split())
mean = sum([len(word) for word in words])/len(words)
# extract palindromes from 'story.txt'
with open('story.txt') as f:
    pals = {word for line in f for word in line.split() if word == word[::-1]}
Similarly, generator expressions should be kept as simple as possible.
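As a rough illustration of where the line might be drawn, the sketch below uses a generator expression for a one-step computation and a for statement once the logic needs a named intermediate step. The word list is made up for the example.

```python
words = ['level', 'python', 'noon', 'code']   # hypothetical input

# A simple generator expression: one loop, one short expression.
total_length = sum(len(word) for word in words)

# When the logic grows, a for statement with named steps reads better.
palindromes = []
for word in words:
    is_palindrome = word == word[::-1]
    if is_palindrome:
        palindromes.append(word)
```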
A conditional expression replaces a simple if statement whose branches select between related values of the same type. Use them only in the simplest cases, and never nest them.
print('Tell us {} {}!!'.format(n, 'story' if n == 1 else 'stories'))
# np is the number of people standing in line
print('I see '+('no one' if np == 0
else 'a person' if np == 1
else 'a couple' if np == 2
else 'some people')+' waiting for tickets.')
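The nested conditional expression above can be unrolled into an ordinary if statement. The function name `describe_line` is an assumption for this sketch; `np` is the number of people in line, as in the example.

```python
def describe_line(np):
    """Describe a line of np people waiting for tickets."""
    if np == 0:
        who = 'no one'
    elif np == 1:
        who = 'a person'
    elif np == 2:
        who = 'a couple'
    else:
        who = 'some people'
    return 'I see ' + who + ' waiting for tickets.'
```

Each case now sits on its own line, so a reader can check the cases one at a time.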
Limit lambda functions to simple, use-once situations.
Lambda functions are nameless definitions that should be used to describe simple, single-use, single-parameter functions. Typically, they are used to map values to sorting keys. Their anonymity and abbreviated syntax can obscure the meaning of important code.
# sort cities by decreasing population
cities.sort(key=lambda city: city.census, reverse=True)
# compute product of numbers
prod = lambda *args : 1 if not args else args[0] * prod(*args[1:])
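The product computation above is named and reused, so it arguably reads better as an ordinary definition. The following is one such sketch, using a loop in place of the recursion.

```python
def prod(*args):
    """Compute the product of the arguments (1 if there are none)."""
    result = 1
    for value in args:
        result *= value
    return result
```

A def also gives the function a docstring and a name that appears in error messages, neither of which a lambda provides.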
Function arguments with default values can reduce redundancy in code and improve code reuse. It is important, though, to make sure the default value is immutable. A default value is created once, when the function is defined. If it is mutable, any change made to it during one call persists into later calls.
def include(key, value, table = None): # None is immutable
    if table is None:
        table = dict() # this default value can (and will) be changed
    table[key] = value
    return table
def include(key, value, table = dict()): # one dict() is created at def
    table[key] = value
    return table
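The difference between the two definitions shows up on the second call. In this sketch they are given distinct, hypothetical names (`include_shared` and `include_fresh`) so both can be run side by side.

```python
def include_shared(key, value, table={}):      # one dict shared by all calls
    table[key] = value
    return table

def include_fresh(key, value, table=None):     # new dict for each call
    if table is None:
        table = {}
    table[key] = value
    return table

first_shared = include_shared('a', 1)
second_shared = include_shared('b', 2)    # same dict: also contains 'a'

first_fresh = include_fresh('a', 1)
second_fresh = include_fresh('b', 2)      # contains only 'b'
```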
“Zero-like” values (numeric versions of zero and empty containers) are interpreted as False, and all others are interpreted as True. Use this fact to meaningfully simplify conditional expressions.
# l is a list to be destructively shuffled
shuffled = []
while l:
    selected = randint(0,len(l)-1)
    shuffled.append(l.pop(selected))
while l != []:
    first, *rest = l
    process(first)
    l = rest
(Though remember Rule 6.12: When checking for an object reference, use is None or is not None.)
Avoid True and False in boolean computations.
Expressions involving explicit boolean constants (True and False) can always be improved. Simpler expressions are more readable.
while ball_went_foul:
    pitch_to_batter()
...
if ball_caught == False: # more readable: if not ball_caught:
    wait_to_run()
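The same simplification applies to functions that return booleans. Both hypothetical functions below compute the same result, but the second lets the comparison speak for itself.

```python
def is_even_verbose(n):
    """Works, but the boolean constants add nothing."""
    if (n % 2 == 0) == True:
        return True
    else:
        return False

def is_even(n):
    """The comparison already produces the boolean we want."""
    return n % 2 == 0
```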
When checking for an object reference, use is None.
In Python, the unique value None means no object. Variables that currently do not reference any object (sometimes referred to as a null reference) have the value None. Functions that do not return a value return None. To test for a value, use is not None. Similarly, is None checks for a lack of value. While None is interpreted as the boolean False, so are other reasonable referenced values.
def combine(a_stream, b_stream, combiner = None):
    if combiner is None: # idiom for complex default arg
        combiner = lambda a,b : a+b
    while a_stream and b_stream:
        yield combiner(a_stream.pop(0), b_stream.pop(0))
def __init__(self, name, allergies):
    if not allergies: # wrong: use 'allergies is None'
        self.reactions = "<Unknown>"
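To see why the truthiness test above goes wrong, consider an empty list, which is a meaningful value ("no allergies") yet is interpreted as False. The `Patient` class below is a hypothetical completion of the fragment above, written with the `is None` test.

```python
class Patient:
    """Hypothetical class; the names here are illustrative only."""
    def __init__(self, name, allergies=None):
        self.name = name
        if allergies is None:          # nothing was reported
            self.reactions = "<Unknown>"
        else:                          # an empty list means "no allergies"
            self.reactions = list(allergies)

unknown = Patient("Ann")
none_known = Patient("Bo", [])
```

With `if not allergies`, both patients would have been recorded as "<Unknown>", losing the distinction between "not reported" and "none".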
Python scripts often mature into small collections of tools. Avoid executing script-only code during import with the Python idiom:
# end of script definitions
if __name__ == "__main__":
    # beginning of script execution
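A filled-in version of that skeleton might look like the sketch below. The function names `mean_length` and `main` are assumptions. The idiom requires only the final test of `__name__`.

```python
def mean_length(words):
    """Compute the average length of the strings in words (0 if empty)."""
    result = 0
    if words:
        result = sum(len(word) for word in words) / len(words)
    return result

def main():
    """Script-only behavior lives here (hypothetical example)."""
    print(mean_length(['purple', 'cow']))

# Runs when the file is executed as a script, but not when it is imported.
if __name__ == "__main__":
    main()
```

Another program can now import this file to reuse `mean_length` without triggering the print.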