Python Library Reference

  • In the left column, all names refer to functions, methods, examples of arguments to a function or method, or examples of an object we might call the method on. For example, tbl refers to a table, array refers to an array, str is a string, and num refers to a number. array.item(0) is an example call for the method item, and in that example, array is the name previously given to some array.

Library Sections

Table Functions and Methods

Name Ch Description Parameters Output

Table()

6

Create an empty table, usually to extend with data

None

An empty Table

Examples
table = Table()

Table().read_table(filename)

6

Create a table from a data file

string: the name of the file

Table with the contents of the data file

Examples
table = Table.read_table('https://raw.githubusercontent.com/cs104williams/assets/main/trees.csv')
table
Genus Species Common Name Count
Acer pensylvanicum Striped maple 24
Acer rubrum Red maple 20
Acer saccharum Sugar maple 2
Betula alleghaniensis yellow birch 7
Betula lenta Sweet birch 2
Betula papyrifera Paper birch 2
Fagus grandifolia American beech 125
Quercus rubra Northern red oak 1

tbl.with_columns(column, values)
tbl.with_columns(n1, v1, n2, v2,
                 ...)

6

A table with an additional or replaced column or columns. column is the name of a column, and values is an array.

  1. string: the name of the new column

  2. array: the values in that column.

Table: a copy of the original Table with the new columns added

Examples
t = Table().with_columns(
  'letter', make_array('a', 'b', 'c', 'z'),
  'points', make_array(  1,   2,   2,  10),
)
t
letter points
a 1
b 2
c 2
z 10
counts = make_array(9, 3, 3, 1)
t2 = t.with_columns(
  'count', counts
)
t2
letter points count
a 1 9
b 2 3
c 2 3
z 10 1
t3 = t.with_columns('doubled', t.column('points') * 2)
t3
letter points doubled
a 1 2
b 2 4
c 2 4
z 10 20
t4 = t2.with_columns('total', t2.column('points') * t2.column('count'))
t4
letter points count total
a 1 9 9
b 2 3 6
c 2 3 6
z 10 1 10

tbl.column(column)

6

The values of a column (an array)

string: the column name

array: the values in that column

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.column('count')
array([9, 3, 3, 1])

tbl.num_rows

6

Compute the number of rows in a table

None

int: the number of rows in the table

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.num_rows
4

tbl.num_columns

6

Compute the number of columns in a table

None

int: the number of columns in the table

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.num_columns
3

tbl.labels

6

Lists the column labels in a table

None

tuple: the names of the columns (as strings) in the table

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.labels
('letter', 'count', 'points')

tbl.select(col1, col2, ...)

6

Create a copy of a table with only some of the columns.

string: column name(s)

Table with the selected columns

Examples
flowers
Petals Name Weight
8 lotus 10
34 sunflower 5
5 rose 6
flowers.select('Petals', 'Weight')
Petals Weight
8 10
34 5
5 6

tbl.drop(col1, col2, ...)

6

Create a copy of a table without some of the columns.

string: column name(s)

Table without the selected columns

Examples
flowers
Petals Name Weight
8 lotus 10
34 sunflower 5
5 rose 6
flowers.drop('Petals', 'Weight')
Name
lotus
sunflower
rose

tbl.relabeled(old_label, new_label)

6

Creates a new table, changing the column name specified by the old label to the new label, and leaves the original table unchanged.

  1. string: the old column name

  2. string: the new column name

Table: a new Table

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.relabeled('count', 'number')
letter number points
a 9 1
b 3 2
c 3 2
z 1 10

tbl.show(n)

6.1

Display n rows of a table. If no argument is specified, defaults to displaying the entire table.

(Optional) int: number of rows you want to display

None: displays a table with n rows

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.show()
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.show(2)
letter count points
a 9 1
b 3 2

... (2 rows omitted)

tbl.sort(column)

6.1

Create a copy of a table sorted by the values in a column. There are two optional parameters:

  • Defaults to ascending order unless descending = True is included.

  • Including distinct=True yields a table with only the first row containing each value in the column.

  1. string: column

  2. (Optional) descending = True

  3. (Optional) distinct = True

Table: a copy of the original with the column sorted

Examples
flowers.sort('Petals')
Petals Name Weight
5 rose 6
8 lotus 10
34 sunflower 5
flowers.sort('Name', descending=True)
Petals Name Weight
34 sunflower 5
5 rose 6
8 lotus 10
marbles
Color Shape Amount Price
Red Round 4 1.3
Green Rectangular 6 1.2
Blue Rectangular 12 2
Red Round 7 1.75
Green Rectangular 9 0
Green Round 2 3
marbles.sort("Color", distinct=True)
Color Shape Amount Price
Blue Rectangular 12 2
Green Rectangular 6 1.2
Red Round 4 1.3
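The effect of distinct=True can be pictured with a plain-Python sketch. This is not the library's implementation; the rows are hypothetical tuples standing in for the marbles table, and sort_distinct is an illustrative helper name.

```python
# Rows stand in for the marbles table: (Color, Shape, Amount, Price).
marbles_rows = [
    ("Red", "Round", 4, 1.3),
    ("Green", "Rectangular", 6, 1.2),
    ("Blue", "Rectangular", 12, 2),
    ("Red", "Round", 7, 1.75),
    ("Green", "Rectangular", 9, 0),
    ("Green", "Round", 2, 3),
]

def sort_distinct(rows, key_index):
    """Sort rows by one field, keeping only the first row for each value."""
    seen = set()
    result = []
    # sorted() is stable, so rows with equal keys keep their original order.
    for row in sorted(rows, key=lambda r: r[key_index]):
        if row[key_index] not in seen:
            seen.add(row[key_index])
            result.append(row)
    return result

distinct_rows = sort_distinct(marbles_rows, 0)
```

Under these assumptions, distinct_rows holds the same three rows as the marbles.sort("Color", distinct=True) example.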

tbl.where(column, predicate)

6.2

Create a copy of a table with only the rows that match some predicate. See Table.where Predicates below.

  1. string: column name

  2. are.(...) predicate

Table: a copy of the original table with only the rows that match the predicate

Examples
marbles
Color Shape Amount Price
Red Round 4 1.3
Green Rectangular 6 1.2
Blue Rectangular 12 2
Red Round 7 1.75
Green Rectangular 9 0
Green Round 2 3
marbles.where("Price", 1.3)
Color Shape Amount Price
Red Round 4 1.3
marbles.where("Price", are.equal_to(1.3))
Color Shape Amount Price
Red Round 4 1.3
marbles.where("Price", are.above(1.3))
Color Shape Amount Price
Blue Rectangular 12 2
Red Round 7 1.75
Green Round 2 3

tbl.take(row_indices)

6.2

A table with only the rows at the given indices. row_indices is either an array of indices or an integer corresponding to one index.

array of ints: the indices of the rows to be included in the Table OR int: the index of the row to be included

Table: a copy of the original table with only the rows at the given indices

Examples
grades
letter grade gpa
A+ 4
A 4
A- 3.7
B+ 3.3
B 3
B- 2.7
grades.take(0)
letter grade gpa
A+ 4
grades.take(np.arange(0,3))
letter grade gpa
A+ 4
A 4
A- 3.7

tbl.take_clean(column, type)
tbl.take_clean(n1, t1, n2, t2,
               ...)

A table with only the rows with valid values of the appropriate type for each specified column.

  1. string: the name of a column in the table.

  2. type: the type of values stored in the column (int, float, or str).

Table: a copy of the original Table containing only the rows that match the type criteria for the specified columns.

Examples
messy_tiles
letter count points
a 9 1
b nan 2
c 3 3.2
z one 10
messy_tiles.take_clean('count', int)
letter count points
a 9 1
c 3 3.2
messy_tiles.take_clean('count', int, 
                       'points', int)
letter count points
a 9 1

tbl.take_messy(column, type)
tbl.take_messy(n1, t1, n2, t2,
               ...)

A table with only the rows with invalid values of the appropriate type in one or more of the specified columns.

  1. string: the name of a column in the table.

  2. type: the type of values stored in the column (int, float, or str).

Table: a copy of the original Table containing only the rows that do not match the type criteria for the specified columns.

Examples
messy_tiles
letter count points
a 9 1
b nan 2
c 3 3.2
z one 10
messy_tiles.take_messy('count', int)
letter count points
b nan 2
z one 10
messy_tiles.take_messy('count', int, 
                       'points', int)
letter count points
b nan 2
z one 10
c 3 3.2

tbl.replace(column,
            old, new)

Create a copy of the table where all values equal to old in the given column are replaced with the new value.

  1. column: string: the column name

  2. old: the value to replace

  3. new: the new value

Table: a copy of the original Table with the old value replaced with new.

Examples
marbles
Color Shape Amount Price
Red Round 4 1.3
Green Rectangular 6 1.2
Blue Rectangular 12 2
Red Round 7 1.75
Green Rectangular 9 0
Green Round 2 3
marbles.replace('Color', 'Red', 'Purple')
Color Shape Amount Price
Purple Round 4 1.3
Green Rectangular 6 1.2
Blue Rectangular 12 2
Purple Round 7 1.75
Green Rectangular 9 0
Green Round 2 3
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.replace('count', 3, 1)
letter count points
a 9 1
b 1 2
c 1 2
z 1 10

tbl.scatter(x_column, y_column)

7

Draws a scatter plot consisting of one point for each row of the table. Note that
x_column and y_column must be strings specifying column names. Include optional argument
fit_line=True if you want to draw a line of best fit for each set of points.

  1. string: name of the column on the x-axis

  2. string: name of the column on the y-axis

  3. (Optional) fit_line=True

None: draws a scatter plot

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.scatter('count', 'points')
../_images/python-library-ref_127_0.png
tiles.scatter('count', 'points', 
              fit_line=True)
../_images/python-library-ref_129_0.png

tbl.plot(x_column, y_column)      
tbl.plot(x_column)

7

Draw a line graph consisting of one point for each row of the table. If you only specify one column, it will plot the rest of the columns on the y-axis as different colored lines.

  1. string: name of the column on the x-axis

  2. string: name of the column on the y-axis

None: draws a line graph

Examples
projections
days price projection
0 90.5 90.75
1 90 82
2 83 82.5
3 95.5 82.5
4 82 83
5 82 82.5
projections.plot('days')
../_images/python-library-ref_135_0.png
projections.plot('days', 'price')
../_images/python-library-ref_137_0.png

tbl.barh(categories)      
tbl.barh(categories, values)

7.1

Displays a horizontal bar chart with one bar for each category in a column, with length proportional to the corresponding value. The values argument is unnecessary if the table has only a column of categories and a column of values.

  1. string: name of the column with categories

  2. (Optional) string: the name of the column with values for corresponding categories

None: draws a bar chart

Examples
flowers
Petals Name Weight
8 lotus 10
34 sunflower 5
5 rose 6
flowers.barh('Name')
../_images/python-library-ref_143_0.png
flowers.barh('Name', 'Weight')
../_images/python-library-ref_145_0.png

tbl.hist(column, unit, bins)

7.2

Generates a histogram of the numerical values in a column. unit and bins are optional arguments, used to label the axes and group the values into intervals (bins), respectively. Bins have the form [a, b), where a is included in the bin and b is not.

  1. string: name of the column with numerical values

  2. (Optional) string: units of x-axis

  3. (Optional) array of ints/floats denoting bin boundaries

None: draws a histogram

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.hist('count', 
           bins=make_array(0,2,4,6,8,10))
../_images/python-library-ref_151_0.png
tiles.hist('count', 
           bins=make_array(0,5,10))
../_images/python-library-ref_153_0.png

tbl.bin(column)      
tbl.bin(column, bins)

7.2

Groups values into intervals, known as bins. Results in a two-column table that contains the number of rows in each bin. The first column lists the left endpoints of the bins, except in the last row. If the bins argument isn’t used, default is to produce 10 equally wide bins between the min and max values of the data.

  1. string: column name

  2. (Optional) array of ints/floats denoting bin boundaries or an int of the number of bins you want

Table: a new Table

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.bin('points')
bin points count
1 1
1.9 2
2.8 0
3.7 0
4.6 0
5.5 0
6.4 0
7.3 0
8.2 0
9.1 1

... (1 rows omitted)

tiles.bin('points', 
          bins=make_array(0,5,10,15))
bin points count
0 3
5 0
10 1
15 0
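The bin counts in the second example can be reproduced with a small plain-Python sketch. It assumes that every bin is [a, b) except the last, which also includes its right endpoint (the NumPy histogram convention); bin_counts is a hypothetical helper, not part of the library.

```python
def bin_counts(values, boundaries):
    """Count values per bin; bins are [a, b) except the last, which is [a, b]."""
    counts = [0] * len(boundaries)
    last = len(boundaries) - 2  # index of the final bin
    for v in values:
        for i in range(len(boundaries) - 1):
            lo, hi = boundaries[i], boundaries[i + 1]
            if lo <= v < hi or (i == last and v == hi):
                counts[i] += 1
                break
    # The final row lists the right end of the last bin and always has count 0.
    return list(zip(boundaries, counts))

points = [1, 2, 2, 10]
table_rows = bin_counts(points, [0, 5, 10, 15])
```

Each pair in table_rows is (left endpoint, count), matching the rows of the two-column table shown above.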

tbl.apply(function, col1, col2, ...)

8.1

Returns an array of values resulting from applying a function to each item in a column.

  1. function: function to apply to column

  2. (Optional) string: name of the column to apply function to. You may have multiple columns, in which case the respective column’s values for each row will be passed as the corresponding argument to the function. If there is no argument, your function will be applied to every row object in tbl

array: contains an element for each value in the original column after applying the function to it

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
def inc(x):
  return x + 1

tiles.apply(inc, 'count')
array([10,  4,  4,  2])
def mult(x, y):
  return x * y

tiles.apply(mult, 'count', 'points')
array([ 9,  6,  6, 10])
def ratio(row):
  return row[2] / row[1]

tiles.apply(ratio)
array([ 0.11111111,  0.66666667,  0.66666667, 10.        ])
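A rough plain-Python picture of what apply does when column names are given: the function is called once per row, with that row's values from the named columns as arguments. The dictionary of lists and the helper apply_sketch are illustrative stand-ins, not the library's data structures; the no-column form (passing whole rows) is not covered.

```python
# Columns stand in for the tiles table.
tiles_columns = {
    "letter": ["a", "b", "c", "z"],
    "count": [9, 3, 3, 1],
    "points": [1, 2, 2, 10],
}

def apply_sketch(columns, function, *labels):
    """Call function once per row, passing that row's values from the named columns."""
    selected = [columns[label] for label in labels]
    return [function(*args) for args in zip(*selected)]

def inc(x):
    return x + 1

def mult(x, y):
    return x * y

incremented = apply_sketch(tiles_columns, inc, "count")
products = apply_sketch(tiles_columns, mult, "count", "points")
```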

tbl.group(column_or_columns, func)

8.2

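Groups rows by the unique values in a column, or by the unique combinations of values in several columns. Multiple columns must be passed as an array or list. By default each group is counted; if the optional argument func is given, it is used to aggregate the values of the other columns within each group.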

  1. string or array of strings: column(s) on which to group

  2. (Optional) function: function to aggregate values in cells (defaults to count)

Table: a new Table

Examples
marbles
Color Shape Amount Price
Red Round 4 1.3
Green Rectangular 6 1.2
Blue Rectangular 12 2
Red Round 7 1.75
Green Rectangular 9 0
Green Round 2 3
marbles.group('Color')
Color count
Blue 1
Green 3
Red 2
marbles.group('Color', max)
Color Shape max Amount max Price max
Blue Rectangular 12 2
Green Round 9 3
Red Round 7 1.75
marbles.group('Shape', sum)
Shape Color sum Amount sum Price sum
Rectangular 27 3.2
Round 13 6.05
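Grouping can be sketched in plain Python as collecting the values for each key and then aggregating each collection. The tuples and the helper group_sketch are hypothetical and handle only one value column at a time, unlike the table method.

```python
# Rows stand in for the marbles table: (Color, Shape, Amount, Price).
marbles_rows = [
    ("Red", "Round", 4, 1.3),
    ("Green", "Rectangular", 6, 1.2),
    ("Blue", "Rectangular", 12, 2),
    ("Red", "Round", 7, 1.75),
    ("Green", "Rectangular", 9, 0),
    ("Green", "Round", 2, 3),
]

def group_sketch(rows, key_index, value_index, collect):
    """Gather one field's values per key, then aggregate each group with collect."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key_index], []).append(row[value_index])
    # Keys are reported in sorted order, as in the examples above.
    return [(key, collect(values)) for key, values in sorted(groups.items())]

color_counts = group_sketch(marbles_rows, 0, 2, len)
max_amount_by_color = group_sketch(marbles_rows, 0, 2, max)
total_amount_by_shape = group_sketch(marbles_rows, 1, 2, sum)
```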

tbl.pivot(col1, col2)
tbl.pivot(col1, col2,
          values, collect)

8.3

A pivot table where each unique value in col1 has its own column and each unique value in col2 has its own row. By default, each cell counts the rows with that combination of values. If values and collect are given, each cell instead holds the result of applying the collect function to the entries of the values column for that combination.

  1. string: name of column whose unique values will make up columns of pivot table

  2. string: name of column whose unique values will make up rows of pivot table

  3. (Optional) string: name of column that describes the values of cell

  4. (Optional) function: how the values are collected, e.g. sum or np.mean

Table: a new Table

Examples
marbles
Color Shape Amount Price
Red Round 4 1.3
Green Rectangular 6 1.2
Blue Rectangular 12 2
Red Round 7 1.75
Green Rectangular 9 0
Green Round 2 3
marbles.pivot('Color', 'Shape')
Shape Blue Green Red
Rectangular 1 2 0
Round 0 1 2
marbles.pivot('Color', 'Shape', 
              values='Price', collect=np.mean)
Shape Blue Green Red
Rectangular 2 0.6 0
Round 0 3 1.525
marbles.pivot('Color', 'Shape', 
              values='Price', collect=max)
Shape Blue Green Red
Rectangular 2 1.2 0
Round 0 3 1.75
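The values/collect form of pivot can be pictured as a nested dictionary, one entry per (row value, column value) pair. This is a sketch under the assumption that empty cells are reported as 0, as in the example output; pivot_sketch is a hypothetical helper.

```python
# Rows stand in for the marbles table: (Color, Shape, Amount, Price).
marbles_rows = [
    ("Red", "Round", 4, 1.3),
    ("Green", "Rectangular", 6, 1.2),
    ("Blue", "Rectangular", 12, 2),
    ("Red", "Round", 7, 1.75),
    ("Green", "Rectangular", 9, 0),
    ("Green", "Round", 2, 3),
]

def pivot_sketch(rows, col_index, row_index, value_index, collect):
    """Aggregate one field for every combination of two other fields."""
    col_values = sorted({r[col_index] for r in rows})
    row_values = sorted({r[row_index] for r in rows})
    cells = {rv: {cv: [] for cv in col_values} for rv in row_values}
    for r in rows:
        cells[r[row_index]][r[col_index]].append(r[value_index])
    # Combinations with no rows are reported as 0.
    return {
        rv: {cv: (collect(vals) if vals else 0) for cv, vals in by_col.items()}
        for rv, by_col in cells.items()
    }

max_price = pivot_sketch(marbles_rows, 0, 1, 3, max)
```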

tblA.join(colA, tblB)
tblA.join(colA, tblB, colB)

8.4

Generate a table with the columns of tblA and tblB, containing rows for all values of a column that appear in both tables. Default colB is colA. colA and colB must be strings specifying column names.

  1. string: name of column in tblA with values to join on

  2. Table: other Table

  3. (Optional) string: if column names are different between Tables, the name of the shared column in tblB

Table: a new Table

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
played
letter on board
a 4
b 3
c 2
z 1
tiles.join('letter', played)
letter count points on board
a 9 1 4
b 3 2 3
c 3 2 2
z 1 10 1
nums1
A B
9 1
3 2
3 2
1 10
nums2
A C
9 1
1 2
1 2
1 10
nums1.join('A', nums2)
A B C
1 10 2
1 10 2
1 10 10
9 1 1
nums1
A B
9 1
3 2
3 2
1 10
nums2
A C
9 1
1 2
1 2
1 10
nums1.join('A', nums2, 'C')
A B A_2
1 10 9
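The repeated rows in the nums1.join('A', nums2) example follow from how an inner join pairs rows: every row of one table is combined with every row of the other that has the same key. A minimal sketch with hypothetical tuples, joining on the first field of each row:

```python
def join_sketch(rows_a, rows_b):
    """Inner join on the first field; every matching pair of rows yields one row."""
    joined = []
    for a in rows_a:
        for b in rows_b:
            if a[0] == b[0]:
                joined.append(a + b[1:])
    # Sorted by key to mirror the example output.
    return sorted(joined)

nums1 = [(9, 1), (3, 2), (3, 2), (1, 10)]   # columns A, B
nums2 = [(9, 1), (1, 2), (1, 2), (1, 10)]   # columns A, C
joined = join_sketch(nums1, nums2)
```

The value 3 appears only in nums1, so it is dropped; the value 1 appears once in nums1 and three times in nums2, so it produces three rows.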

tbl.sample(n)      
tbl.sample(n, with_replacement)

10

A new table where n rows are randomly sampled from the original table; by default, n=tbl.num_rows. Sampling is with replacement by default; for sampling without replacement, use the argument with_replacement=False. For a non-uniform sample, provide the argument weights=distribution, where distribution is an array or list containing the probability of each row. In the examples, the sample size is passed by keyword as k.

  1. int: sample size

  2. (Optional) with_replacement=True

Table: a new Table with n rows

Examples
jobs
job wage
a 10
b 20
c 15
d 8
jobs.sample()
job wage
c 15
c 15
d 8
c 15
jobs.sample(k=2)
job wage
b 20
d 8
jobs.sample(k=4, 
            with_replacement=False) 
job wage
d 8
a 10
c 15
b 20
jobs.sample(k=8, 
            with_replacement=True) 
job wage
a 10
d 8
d 8
c 15
c 15
b 20
a 10
a 10
dist = make_array(0.9, 0.1, 0, 0)
jobs.sample(k=8, weights=dist)
job wage
a 10
a 10
a 10
a 10
a 10
a 10
a 10
a 10
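The difference between the sampling modes can be sketched with Python's standard random module. This is an illustration of the behavior described above, not the library's code; sample_sketch and the tuple rows are hypothetical.

```python
import random

# Rows stand in for the jobs table: (job, wage).
jobs_rows = [("a", 10), ("b", 20), ("c", 15), ("d", 8)]

def sample_sketch(rows, k=None, with_replacement=True, weights=None):
    """Draw k rows at random; k defaults to the number of rows."""
    if k is None:
        k = len(rows)
    if with_replacement:
        # Each draw is independent, so a row can appear more than once.
        return random.choices(rows, weights=weights, k=k)
    # Without replacement, no row can be drawn twice (weights are ignored here).
    return random.sample(rows, k)

shuffled = sample_sketch(jobs_rows, with_replacement=False)
weighted = sample_sketch(jobs_rows, k=8, weights=[0.9, 0.1, 0, 0])
```

Sampling all rows without replacement is just a shuffle, and rows with weight 0 never appear in the weighted sample.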

tbl.row(row_index)

17.3

Accesses the row of a table by taking the index of the row as its argument. Note that rows are in general not arrays, as their elements can be of different types. However, you can use .item to access a particular element of a row using row.item(label).

int: row index

Row object with the values of the row and labels of the corresponding columns

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.row(1)
Row(letter='b', count=3, points=2)
tiles.row(1).item('letter')
'b'

tbl.rows

N/A

Accesses all of the rows of a table.

None

Rows object made up of all rows as individual row objects

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.rows
Rows(letter | count | points
a      | 9     | 1
b      | 3     | 2
c      | 3     | 2
z      | 1     | 10)
tiles.rows[1].item('count')
3

String Methods

Name Ch Description Parameters Output

str.split(separator)

Splits the string (str) into a list based on the separator that is passed in

string: the text used to separate parts of the string

A list of strings

Examples
text
'a,b,c,b,e'
text.split(",")
['a', 'b', 'c', 'b', 'e']
text.split("b")
['a,', ',c,', ',e']

str.join(array)

Combines the elements of array into one string, with str placed between consecutive elements

array: the array to turn into a string

A new string

Examples
letters
array(['a', 'b', 'c', 'd'], dtype='<U1')
','.join(letters)
'a,b,c,d'
' and '.join(letters)
'a and b and c and d'

str.replace(old, new)

4.2.1

Replaces each occurrence of old in str with the value of new

  1. string: The text to replace.

  2. string: The new text.

A new string

Examples
text
'yay! data science'
text.replace('data', 'computer')
'yay! computer science'
text.replace('a', 'aaa')
'yaaay! daaataaa science'

Array Functions and Methods

In the examples in the left column, np refers to the NumPy module, as usual.

Name Ch Description Parameters Output

make_array(v1, v2, ...)

5

Makes a numpy array with the values passed in

The values to use

A new array

Examples
a = make_array(1,2,3,4)
a
array([1, 2, 3, 4])
a = make_array("A", "B")
a
array(['A', 'B'], dtype='<U1')
a = make_array(1.1, 2.2, 3.3)
a
array([1.1, 2.2, 3.3])

max(array)

3.3

Returns the maximum value of an array

array: the values to use

An element from array.

Examples
odds
array([1, 3, 5, 7])
max(odds)
7

min(array)

3.3

Returns the minimum value of an array

array: the values to use

An element from array.

Examples
odds
array([1, 3, 5, 7])
min(odds)
1

sum(array)

3.3

Returns the sum of the values in an array

array: the values to use

The total of the values in array.

Examples
seq
array([ 1, -1,  2, -2,  3, -3])
sum(seq)
0
seq
array([ 1, -1,  2, -2,  3, -3])
# Can count number of elements meeting a condition too
sum(seq < 0)
3
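The counting trick works because True counts as 1 and False as 0 when summed. The same idea in plain Python, using a list in place of an array:

```python
seq = [1, -1, 2, -2, 3, -3]

# Each comparison is True or False; sum() treats True as 1 and False as 0.
negatives = sum(v < 0 for v in seq)
```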

abs(num)
abs(array)

3.3

Take the absolute value of number or each number in an array.

number or array: the value(s) to use

A number or array

Examples
abs(-2.2)
2.2
seq
array([ 1, -1,  2, -2,  3, -3])
abs(seq)
array([1, 1, 2, 2, 3, 3])

round(num) 
np.round(array)

3.3

Round number or array of numbers to the nearest integer.

number or array: the value(s) to use

A number or array

Examples
round(2.2)
2
reals
array([0.25, 0.5 , 0.75, 1.  ])
np.round(reals)
array([0., 0., 1., 1.])
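In the example above, 0.5 rounds to 0 rather than 1. Both Python's built-in round and np.round send values exactly halfway between two integers to the nearest even integer. A short check using only the built-in:

```python
halves = [0.5, 1.5, 2.5, 3.5]

# Ties go to the nearest even integer, not always upward.
rounded = [round(v) for v in halves]
```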

len(array)

3.3

Returns the length (number of elements) of an array

array: the values to use

An int

Examples
odds
array([1, 3, 5, 7])
len(odds)
4

np.average(array)
np.mean(array)

5.1

Returns the mean value of an array

array: the values to use

A number

Examples
odds
array([1, 3, 5, 7])
np.mean(odds)
4.0

np.diff(array)

5.1

Given an array with contents [ v0, v1, v2, v3, ... ], returns a new array of size len(array)-1 with elements equal to the difference between adjacent elements: [ v1-v0, v2-v1, v3-v2, ... ]

array: the values to use

An array

Examples
odds
array([1, 3, 5, 7])
np.diff(odds)
array([2, 2, 2])
seq
array([ 1, -1,  2, -2,  3, -3])
np.diff(seq)
array([-2,  3, -4,  5, -6])
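The same computation in plain Python, as a sketch of what np.diff returns for a one-dimensional input (diff_sketch is an illustrative name):

```python
def diff_sketch(values):
    """Differences between adjacent elements: [v1 - v0, v2 - v1, ...]."""
    return [b - a for a, b in zip(values, values[1:])]

odds_diff = diff_sketch([1, 3, 5, 7])
seq_diff = diff_sketch([1, -1, 2, -2, 3, -3])
```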

np.sqrt(array)

5.1

Returns an array with the square root of each element

array: the values to use

An array

Examples
squares
array([ 1,  4,  9, 16, 25])
np.sqrt(squares)
array([1., 2., 3., 4., 5.])

np.arange(start, stop, step) 
np.arange(start, stop)
np.arange(stop)

5.2

An array of numbers starting with start, going up in increments of step, and going up to but excluding stop. When start and/or step are left out, default values are used in their place. Default step is 1; default start is 0.

  1. (Optional) number: The start of the sequence

  2. number: The end of the sequence

  3. (Optional) number: The step of the sequence

A new array

Examples
np.arange(0,3)
array([0, 1, 2])
np.arange(3)
array([0, 1, 2])
np.arange(3.0)
array([0., 1., 2.])
np.arange(3,9)
array([3, 4, 5, 6, 7, 8])
np.arange(3,9,2)
array([3, 5, 7])
np.arange(3,9,3)
array([3, 6])

array.item(index)

5.3

Returns the item at the given index in an array (remember, Python indices start at 0!)

int: the index of the item

An item from array

Examples
odds
array([1, 3, 5, 7])
odds.item(0)
1
odds.item(2)
5

np.random.choice(array)
np.random.choice(array, n)

9

Picks one (by default) or some number n of items from an array at random with replacement.

  1. array: The values to use

  2. (Optional) int: The number of random choices

A value or array

Examples
odds
array([1, 3, 5, 7])
np.random.choice(odds)
7
np.random.choice(odds, 6)
array([3, 3, 7, 5, 1, 5])

np.count_nonzero(array)

9

Returns the number of non-zero (or True) elements in an array. (We often use sum instead.)

array: The values to use

An int

Examples
euler
array([  1,   0,  -1,   0,   5,   0, -61,   0])
np.count_nonzero(euler)
4
seq
array([ 1, -1,  2, -2,  3, -3])
# Can count number of elements meeting a condition too
np.count_nonzero(seq < 0)
3

np.append(array, item)

9.2

Returns a copy of the input array with item (must be the same type as the other entries in the array) appended to the end.

  1. array: The initial array

  2. value: The value to add

A new array

Examples
odds
array([1, 3, 5, 7])
extended = np.append(odds, 9)
extended
array([1, 3, 5, 7, 9])

percentile(percentile,
           array)

13.1

Returns the corresponding percentile of an array.

  1. number: a percentile between 0 and 100

  2. array: the array to use

A value

Examples
fibs
array([ 1,  1,  2,  3,  5,  8, 13])
percentile(50, fibs)
3
percentile(25, fibs)
1
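One definition consistent with both examples is "the smallest value that is at least as large as p% of the values". The sketch below assumes that definition; the library's handling of edge cases may differ, and percentile_sketch is a hypothetical name.

```python
import math

def percentile_sketch(p, values):
    """Smallest value that is at least as large as p percent of the values."""
    ordered = sorted(values)
    # Number of values that must be at or below the answer (at least one).
    k = max(math.ceil(p / 100 * len(ordered)), 1)
    return ordered[k - 1]

fibs = [1, 1, 2, 3, 5, 8, 13]
median_like = percentile_sketch(50, fibs)
quartile_like = percentile_sketch(25, fibs)
```

With 7 values, 50% of them is 3.5, so the answer is the 4th smallest value; 25% is 1.75, so the answer is the 2nd smallest.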

np.std(array)

14.2

Returns the standard deviation of an array

array: the values to use

A number

Examples
odds
array([1, 3, 5, 7])
np.std(odds)
2.23606797749979
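By default np.std computes the population standard deviation: the square root of the mean of the squared deviations from the mean. A plain-Python version of that formula (std_sketch is an illustrative name):

```python
import math

def std_sketch(values):
    """Root of the mean squared deviation from the mean (divides by n, not n - 1)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

odds_std = std_sketch([1, 3, 5, 7])
```

For odds the mean is 4, the squared deviations are 9, 1, 1, 9, and their mean is 5, so the result is the square root of 5.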

Table.where(...) Predicates

Any of these predicates can be negated by adding not_ in front of them, e.g. are.not_equal_to(Z) or are.not_containing(S).

Name Ch Description      

are.equal_to(Z)

6.2

Equal to Z

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.where('count',are.equal_to(3))
letter count points
b 3 2
c 3 2

are.not_equal_to(Z)

6.2

Not equal to Z

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.where('count',are.not_equal_to(3))
letter count points
a 9 1
z 1 10

are.above(x)

6.2

Greater than x

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.where('count',are.above(3))
letter count points
a 9 1

are.above_or_equal_to(x)

6.2

Greater than or equal to x

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.where('count',are.above_or_equal_to(3))
letter count points
a 9 1
b 3 2
c 3 2

are.below(x)

6.2

Less than x

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.where('count',are.below(3))
letter count points
z 1 10

are.below_or_equal_to(x)

6.2

Less than or equal to x

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.where('count',are.below_or_equal_to(3))
letter count points
b 3 2
c 3 2
z 1 10

are.between(x,y)

6.2

Greater than or equal to x and less than y

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.where('count',are.between(1,9))
letter count points
b 3 2
c 3 2
z 1 10

are.between_or_equal_to(x,y)

6.2

Greater than or equal to x, and less than or equal to y

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.where('count',are.between_or_equal_to(1,9))
letter count points
a 9 1
b 3 2
c 3 2
z 1 10

are.strictly_between(x,y)

6.2

Greater than x and less than y

Examples
tiles
letter count points
a 9 1
b 3 2
c 3 2
z 1 10
tiles.where('count',are.strictly_between(1,9))
letter count points
b 3 2
c 3 2

are.contained_in(A)

6.2

Is a substring of A (if A is a string) or an element of A (if A is a list/array)

Examples
grades
letter grade gpa
A+ 4
A 4
A- 3.7
B+ 3.3
B 3
B- 2.7
grades.where('letter grade', 
  are.contained_in('AB'))
letter grade gpa
A 4
B 3
grades.where('letter grade', 
  are.contained_in(make_array('A', 'B')))
letter grade gpa
A 4
B 3
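The two cases behave like Python's in operator, which is an assumption here but agrees with the example output: with a string it tests for a substring, and with a list or array it tests for an exact element.

```python
grades = ["A+", "A", "A-", "B+", "B", "B-"]

# With a string, `in` tests for a substring: "A+" is not a substring of "AB".
in_string = [g for g in grades if g in "AB"]

# With a list, `in` tests for an exact element match.
in_list = [g for g in grades if g in ["A", "B"]]
```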

are.containing(S)

6.2

Contains the string S

Examples
grades
letter grade gpa
A+ 4
A 4
A- 3.7
B+ 3.3
B 3
B- 2.7
grades.where('letter grade', are.containing('A'))
letter grade gpa
A+ 4
A 4
A- 3.7

Other Functions

Name Ch Description Parameters Output

check(condition)

condition may be any boolean expression. It will report a message if the condition is not True. Use this to test your code!

condition : the condition to ensure is true

None: prints a message if the condition is false

Examples
check(1 < 2)
check(2 < 1)
🐝 check(2 < 1)
      2 < 1 is false because
        2 >= 1

sample_proportions(sample_size,        
                   model_proportions)

11.1

sample_size should be an integer and model_proportions an array of probabilities that sum to 1. The function samples sample_size objects from the distribution specified by model_proportions. It returns an array with the same size as model_proportions. Each item in the array is the proportion of the sample_size draws in which the corresponding category was sampled.

  1. int : sample size

  2. array : an array of proportions that should sum to 1

array : each item corresponds to the proportion of times that corresponding item was sampled from model_proportions in sample_size draws, should sum to 1

Examples
model_proportions = make_array(0.9, 0.1)
sample_proportions(100, model_proportions)
array([0.92, 0.08])
model_proportions = make_array(0.7, 0.2, 0.1)
sample_proportions(100, model_proportions)
array([0.68, 0.24, 0.08])
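What the function computes can be sketched with the standard random module: draw sample_size category indices according to the model, then report the fraction of draws that landed in each category. This is an illustration only; sample_proportions_sketch is a hypothetical name and returns a list rather than an array.

```python
import random

def sample_proportions_sketch(sample_size, model_proportions):
    """Proportion of draws falling in each category of the model."""
    categories = range(len(model_proportions))
    draws = random.choices(categories, weights=model_proportions, k=sample_size)
    return [draws.count(c) / sample_size for c in categories]

proportions = sample_proportions_sketch(100, [0.7, 0.2, 0.1])
```

The result varies from run to run, but it always has one entry per category and the entries sum to 1.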

interact(f, controls)

Create an interactive visualization for the function f using the controls to let the user adjust the parameters to f. Given a function f(x1,x2,...), call interact(f, x1=c1, x2=c2, ...), where each c1, c2, … is:

  • a Fixed(v) to describe a constant value v,

  • a Slider(lo, hi) or Slider(lo,hi,step) to describe a range of possible values, or

  • a Choice(array) to describe a choice among the elements of array.

  • f: the function to interact with.

  • controls: the controls dictating the visualization’s UI.

None

Examples
def plot_genus(genus):
  t = trees.where('Genus', genus)
  t.barh('Common Name', 'Count')
    
genuses = make_array('Acer','Betula')
interact(plot_genus, 
         genus=Choice(genuses))

def length_hist(num_bins):
  return bills.hist('bill_length_mm',
                    bins=num_bins)

interact(length_hist, 
         num_bins=Slider(1,20))

def sum_three(a,b,c):
  print("Sum = ", a + b + c)

interact(sum_three, 
         a=Slider(1,20),
         b=Choice(5,10),
         c=Fixed(100))

minimize(function)

15.4

Returns an array of values such that, if the values were passed into function as its arguments, the output of function would be minimized. For a function of one argument, a single number is returned, as in the first example.

function : name of a function that will be minimized.

array : An array in which each element corresponds to an argument that minimizes the output of the function. Values in the array are listed based on the order they are passed into the function; the first element in the array is also going to be the first value passed into the function.

Examples
def polynomial(x):
  return 3 * x ** 2 - 4 * x + 5

minimize(polynomial)
0.6666666640092055
x = make_array(1, 2, 3, 4)
y = make_array(2.9, 4.8, 7, 9.2)

def function(any_slope, any_intercept):
  fitted = any_slope*x + any_intercept
  return np.mean((y - fitted) ** 2)

minimize(function)
array([2.11, 0.7 ])
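The second result can be cross-checked against the closed-form least-squares line, worked here in plain Python. This is a check on the answer, not a description of how minimize works internally.

```python
x = [1, 2, 3, 4]
y = [2.9, 4.8, 7, 9.2]

x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)

# Least-squares slope: sum of products of deviations over sum of squared x deviations.
slope = (
    sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    / sum((xi - x_mean) ** 2 for xi in x)
)
# The least-squares line passes through the point of means.
intercept = y_mean - slope * x_mean
```

The slope and intercept agree with the values returned by minimize(function) above, up to rounding.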

Plot Customization and Annotation

The table methods hist, plot, scatter, and barh all return a plot that you can use to customize or annotate your figure in various ways using the following methods.

For example, the following code changes the title and adds a red dot to the plot on the right.

plot = heads_count.hist()
plot.set_title('Number of Heads in 100 Coin Flips')
plot.dot(50)
../_images/python-library-ref_497_0.png
Name    Description      

plot.dot(x)
plot.dot(x, y)

Places a circle marker on a scatter plot, line plot, or histogram. If only x is provided, the marker is centered at (x,0). Otherwise, it is centered at (x,y).

Examples
plot = heads_count.hist()
plot.dot(45)
../_images/python-library-ref_499_0.png
plot = heads_count.hist()
plot.dot(49, color="orange", size=150)              
plot.dot(53, color="purple", size=300)       
../_images/python-library-ref_501_0.png
plot = bills.scatter('bill_length_mm','bill_depth_mm')
plot.dot(40,18)    
../_images/python-library-ref_503_0.png
plot = projections.plot('days', 'price')
plot.dot(3, 95.5)
../_images/python-library-ref_505_0.png

plot.square(x)
plot.square(x, y)

Places a square marker on a scatter plot, line plot, or histogram. If only x is provided, the marker is centered at (x,0). Otherwise, it is centered at (x,y).

Examples
plot = heads_count.hist()
plot.square(45)
../_images/python-library-ref_507_0.png
plot = heads_count.hist()
plot.square(49, color="orange", size=150)              
plot.square(53, color="purple", size=300)              
../_images/python-library-ref_509_0.png
plot = bills.scatter('bill_length_mm','bill_depth_mm')
plot.square(40,18)
../_images/python-library-ref_511_0.png
plot = projections.plot('days', 'price')
plot.square(3, 95.5)
../_images/python-library-ref_513_0.png

plot.interval(low, high)
plot.interval(range)

Marks an interval on the x-axis with a yellow bar. Call it with either low and high values or a range array containing those values.

Examples
plot = heads_count.hist()
plot.interval(40,60)
../_images/python-library-ref_515_0.png
plot = heads_count.hist()
lo_hi = make_array(45,55)
plot.interval(lo_hi)
../_images/python-library-ref_517_0.png

plot.line(x=v)
plot.line(y=v)
plot.line(x=xs, y=ys)
plot.line(slope=a, intercept=b)

Adds a line to a plot. You may describe the line to draw with several parameters:

  • use x=v to draw an infinite vertical line at the given x-value;

  • use y=v to draw an infinite horizontal line at the given y-value;

  • use x=xs and y=ys, where xs and ys each contain two values, to specify the x and y values for the two end-points of a line segment; or

  • use a slope=a and intercept=b to describe any other infinite line.

Optional arguments color, width, and linestyle can customize the line’s appearance. See the examples for details.

Examples
plot = bills.scatter('bill_length_mm','bill_depth_mm')
plot.line(x=40)            
../_images/python-library-ref_519_0.png
plot = bills.scatter('bill_length_mm','bill_depth_mm')
plot.line(y=18)
../_images/python-library-ref_521_0.png
plot = bills.scatter('bill_length_mm','bill_depth_mm')
plot.line(x=make_array(35,45), y=make_array(18,20),
          color='red')
../_images/python-library-ref_523_0.png
plot = bills.scatter('bill_length_mm','bill_depth_mm')
plot.line(slope=0.179, intercept=11.409, 
          width=4, color='orange', linestyle="dashed")
../_images/python-library-ref_525_0.png

plot.set_title(label)
plot.set_xlabel(label)
plot.set_ylabel(label)

Set the title, or the label of the x or y axis, to a new label.

Examples
plot = bills.scatter('bill_length_mm','bill_depth_mm')
plot.set_title("Penguin Bills")
../_images/python-library-ref_527_0.png
plot = bills.scatter('bill_length_mm','bill_depth_mm')
plot.set_xlabel("Length (mm)")
plot.set_ylabel("Depth (mm)")
../_images/python-library-ref_529_0.png

plot.set_xlim(range)
plot.set_ylim(range)

Set the range of values displayed along the x or y axis.

Examples
plot = bills.scatter('bill_length_mm','bill_depth_mm')
lims = make_array(25,50)
plot.set_xlim(lims)
../_images/python-library-ref_531_0.png
plot = bills.scatter('bill_length_mm','bill_depth_mm')
plot.set_ylim(make_array(10,30))    
../_images/python-library-ref_533_0.png

Based on a reference page written by Nishant Kheterpal and Jessica Hu.