Tables¶
The next cell has some “boiler plate” code. You’ll find it at the top of all notebooks in CS 104. Always run this cell. It sets up our notebook environment to have access to the libraries and resources used in the rest of the code.
from datascience import *
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets
Python Review¶
Expressions¶
24
24
24 * 7
168
24 * 60 * (60 + 5 - 3 * 2)
84960
# two to the power of four: 2 * 2 * 2 * 2
2 ** 4
16
'hello'
'hello'
Variables¶
Terminology: expressions, values, names, variables, statements
Naming rules: letters, numbers, underscores; case sensitive; start with letter usually
Calculate the number of seconds in a year.
60 * 60 * 24 * 365
31536000
seconds_per_year = 60 * 60 * 24 * 365
seconds_per_year
31536000
seconds_per_hour = 60 * 60
hours_per_year = 24 * 365
seconds_per_year = seconds_per_hour * hours_per_year
seconds_per_year
31536000
Functions¶
Terminology: function, call, argument
abs(-5)
5
day_temp = 52
night_temp = 47
abs(night_temp - day_temp)
5
max(3, 4, 6, -2 , 1, 0)
6
y = max(3, 4)
y
4
round(123.456, 1)
123.5
Tables¶
Tables are stored as CSV (Comma separated values) files. Have a look!
Hopkin’s Forest Tree Surveys¶
Trees and more here: https://hmf.williams.edu/researchacademics/data/
trees = Table.read_table('data/hopkins-plot-0011.csv')
trees
genus | species | common name | count |
---|---|---|---|
Acer | pensylvanicum | Maple, striped | 24 |
Acer | rubrum | Maple, red | 20 |
Acer | saccharum | Maple, sugar | 2 |
Betula | alleghaniensis | Birch, yellow | 7 |
Betula | lenta | Birch, black | 2 |
Betula | papyrifera | Birch, paper | 2 |
Fagus | grandifolia | Beech, American | 125 |
Quercus | rubra | Oak, red | 1 |
trees.show(3)
genus | species | common name | count |
---|---|---|---|
Acer | pensylvanicum | Maple, striped | 24 |
Acer | rubrum | Maple, red | 20 |
Acer | saccharum | Maple, sugar | 2 |
... (5 rows omitted)
trees.show()
genus | species | common name | count |
---|---|---|---|
Acer | pensylvanicum | Maple, striped | 24 |
Acer | rubrum | Maple, red | 20 |
Acer | saccharum | Maple, sugar | 2 |
Betula | alleghaniensis | Birch, yellow | 7 |
Betula | lenta | Birch, black | 2 |
Betula | papyrifera | Birch, paper | 2 |
Fagus | grandifolia | Beech, American | 125 |
Quercus | rubra | Oak, red | 1 |
Table Operations: Selecting Columns¶
Terminology: method
trees.select('genus')
genus |
---|
Acer |
Acer |
Acer |
Betula |
Betula |
Betula |
Fagus |
Quercus |
trees.select('common name', 'count')
common name | count |
---|---|
Maple, striped | 24 |
Maple, red | 20 |
Maple, sugar | 2 |
Birch, yellow | 7 |
Birch, black | 2 |
Birch, paper | 2 |
Beech, American | 125 |
Oak, red | 1 |
# trees.select(genus, 'count') # Error!
trees.drop('species')
genus | common name | count |
---|---|---|
Acer | Maple, striped | 24 |
Acer | Maple, red | 20 |
Acer | Maple, sugar | 2 |
Betula | Birch, yellow | 7 |
Betula | Birch, black | 2 |
Betula | Birch, paper | 2 |
Fagus | Beech, American | 125 |
Quercus | Oak, red | 1 |
trees # Still the same...
genus | species | common name | count |
---|---|---|---|
Acer | pensylvanicum | Maple, striped | 24 |
Acer | rubrum | Maple, red | 20 |
Acer | saccharum | Maple, sugar | 2 |
Betula | alleghaniensis | Birch, yellow | 7 |
Betula | lenta | Birch, black | 2 |
Betula | papyrifera | Birch, paper | 2 |
Fagus | grandifolia | Beech, American | 125 |
Quercus | rubra | Oak, red | 1 |
# must rename the variable if want to save the results of a method applied to the Table
trees_without_species = trees.drop('species')
trees_without_species
genus | common name | count |
---|---|---|
Acer | Maple, striped | 24 |
Acer | Maple, red | 20 |
Acer | Maple, sugar | 2 |
Betula | Birch, yellow | 7 |
Betula | Birch, black | 2 |
Betula | Birch, paper | 2 |
Fagus | Beech, American | 125 |
Quercus | Oak, red | 1 |
Table Operations: Sorting¶
Terminology: named argument
trees_without_species.sort('count')
genus | common name | count |
---|---|---|
Quercus | Oak, red | 1 |
Acer | Maple, sugar | 2 |
Betula | Birch, black | 2 |
Betula | Birch, paper | 2 |
Betula | Birch, yellow | 7 |
Acer | Maple, red | 20 |
Acer | Maple, striped | 24 |
Fagus | Beech, American | 125 |
trees_without_species.sort('count', descending=True) # named argument is "descending"
genus | common name | count |
---|---|---|
Fagus | Beech, American | 125 |
Acer | Maple, striped | 24 |
Acer | Maple, red | 20 |
Betula | Birch, yellow | 7 |
Acer | Maple, sugar | 2 |
Betula | Birch, black | 2 |
Betula | Birch, paper | 2 |
Quercus | Oak, red | 1 |
trees_without_species.sort('genus', descending=True)
genus | common name | count |
---|---|---|
Quercus | Oak, red | 1 |
Fagus | Beech, American | 125 |
Betula | Birch, yellow | 7 |
Betula | Birch, black | 2 |
Betula | Birch, paper | 2 |
Acer | Maple, striped | 24 |
Acer | Maple, red | 20 |
Acer | Maple, sugar | 2 |
Tying It All Together: Gifts to President Obama and Family¶
Terminology: method chaining is when the outputs from a previous method are used as inputs to the next method (in a single line).
# derived from: https://raw.githubusercontent.com/tacookson/data/master/us-government-gifts/gifts.csv
gifts = Table.read_table('data/obama-gifts-2010.csv')
# or for all 8 years: gifts = Table.read_table('data/obama-gifts.csv')
gifts.show(15)
donor_country | value_usd | gift_description |
---|---|---|
Bahrain | 430 | 8'' x 10'' Christolfe silver picture frame with a photog ... |
Spain | 660 | 6.25'' x 6.25'' silver plate engraved with royal signatu ... |
nan | 1410 | 30'' x 54'' multicolored Tibetan scroll depicting a Budd ... |
El Salvador | 1000 | Four piece metal sculpture of Don Quixote and Sancho Pan ... |
Ireland | 357 | Two first edition volumes of William Yates books, publis ... |
Ireland | 1545 | Anthology of Irish poetry, signed by six Irish poets. Re ... |
Afghanistan | 2000 | 312'' x 117'' burgundy and red carpet with intricate pat ... |
France | 2150 | Three bottles of 1961 Chateau Bages wine. Two 9'' x 13'' ... |
Russia | 2080 | Two 4'' Imperial porcelain painted eggs with porcelain b ... |
Czechia | 1100 | 14'' hand-blown Moser glass plate with 24- carat gold bo ... |
Nigeria | 8315 | Chris Aire gold cufflinks with centered citrine gemstone ... |
Singapore | 380 | Creative Aurvana Live! black headphones. Creative pocket ... |
Switzerland | 2604 | Longines silver-colored watch with brown band, presented ... |
China | 400 | 12.5'' Jun purple and gray porcelain vase with a round w ... |
Kazakhstan | 3200 | 9.5'' bowl made of silver, jasper, kahalong, and fianit, ... |
... (79 rows omitted)
gifts.sort('value_usd')
donor_country | value_usd | gift_description |
---|---|---|
Japan | 1.325 | 5'' titanium ``SUSGallery'' cup. 4'' titanium ``SUSGalle ... |
Senegal | 345 | 40'' x 70'' black wool shawl with colorful stripes. 16'' ... |
Laos | 350 | 20.75'' x 32'' framed silver-colored metal panel with an ... |
Egypt | 356 | 4.5'' x 3.5'' Bible in Arabic with metallic cross on cov ... |
Ireland | 357 | Two first edition volumes of William Yates books, publis ... |
Ukraine | 360 | 18'' x 62'' embroidered beige tapestry with red floral m ... |
Ukraine | 360 | White blouse with beaded floral design, in a maroon velv ... |
Senegal | 375 | 14'' x 9'' x 4'' Le Darmanco coral and navy blue handbag ... |
Singapore | 380 | Creative Aurvana Live! black headphones. Creative pocket ... |
Egypt | 380 | 24'' orange and black beaded silver-colored necklace wit ... |
... (84 rows omitted)
Terminology: named argument
gifts.sort('value_usd', descending=True)
donor_country | value_usd | gift_description |
---|---|---|
Nigeria | 8315 | Chris Aire gold cufflinks with centered citrine gemstone ... |
India | 7404 | Book, title: ``The Life of Mahatma Gandhi,'' by Louis Fi ... |
United Arab Emirates | 6125 | Cartier 18-carat white gold pearl earrings with diamonds ... |
Saudi Arabia | 5255 | Three Bijan silk tie sets with matching handkerchiefs in ... |
France | 4580 | 13.67'' Daum France crystal ultraviolet vase. Daum Franc ... |
Lebanon | 4520 | 32.25'' x 24.5'' watercolor of Arabic calligraphy by art ... |
Russia | 4000 | 83'' x 43'' framed oil painting of a carnival winter sce ... |
United Kingdom | 4000 | 39'' x 31'' canvas of graffiti letters spray- painted on ... |
Spain | 3600 | 20'' gold Pina & Grau necklace with 1'' Mallorcan cross ... |
Kazakhstan | 3200 | 9.5'' bowl made of silver, jasper, kahalong, and fianit, ... |
... (84 rows omitted)
sorted_by_value = gifts.sort('value_usd', descending=True)
sorted_by_value
donor_country | value_usd | gift_description |
---|---|---|
Nigeria | 8315 | Chris Aire gold cufflinks with centered citrine gemstone ... |
India | 7404 | Book, title: ``The Life of Mahatma Gandhi,'' by Louis Fi ... |
United Arab Emirates | 6125 | Cartier 18-carat white gold pearl earrings with diamonds ... |
Saudi Arabia | 5255 | Three Bijan silk tie sets with matching handkerchiefs in ... |
France | 4580 | 13.67'' Daum France crystal ultraviolet vase. Daum Franc ... |
Lebanon | 4520 | 32.25'' x 24.5'' watercolor of Arabic calligraphy by art ... |
Russia | 4000 | 83'' x 43'' framed oil painting of a carnival winter sce ... |
United Kingdom | 4000 | 39'' x 31'' canvas of graffiti letters spray- painted on ... |
Spain | 3600 | 20'' gold Pina & Grau necklace with 1'' Mallorcan cross ... |
Kazakhstan | 3200 | 9.5'' bowl made of silver, jasper, kahalong, and fianit, ... |
... (84 rows omitted)
sorted_by_value.sort('donor_country').show(30)
donor_country | value_usd | gift_description |
---|---|---|
Afghanistan | 2000 | 312'' x 117'' burgundy and red carpet with intricate pat ... |
Afghanistan | 2000 | 80'' x 148'' brown and blue patterned rug. Rec'd--5/12/2 ... |
Armenia | 2985 | 16'' x 16'' walnut chessboard with walnut and sterling s ... |
Austria | 625 | 32'' x 70'' Dries Van Noten patterned green silk shawl. ... |
Azerbaijan | 940 | Set of two brass and crystal tea cups, presented in a bl ... |
Bahrain | 430 | 8'' x 10'' Christolfe silver picture frame with a photog ... |
Brunei | 1932 | ``Lustre'' necklace with 1'' x 1'' inscribed ``Globe Cre ... |
Canada | 1000 | 17'' sugar maple-wood bowl signed by Don Thur. 24'' x 16 ... |
China | 1400 | White ``Flying Pigeon'' electric bicycle. Rec'd--9/27/20 ... |
China | 629 | 11'' x 9'' ``Expo 2010 Shanghai China'' blue vase with a ... |
China | 400 | 12.5'' Jun purple and gray porcelain vase with a round w ... |
Colombia | 525 | 8'' x 14'' Helena Caballero blue purse. Rec'd--9/24/2010 ... |
Croatia | 510 | Gena black three-piece suit Rec'd--4/26/2010. Est. Value ... |
Czechia | 1100 | 14'' hand-blown Moser glass plate with 24- carat gold bo ... |
Egypt | 380 | 24'' orange and black beaded silver-colored necklace wit ... |
Egypt | 356 | 4.5'' x 3.5'' Bible in Arabic with metallic cross on cov ... |
El Salvador | 1000 | Four piece metal sculpture of Don Quixote and Sancho Pan ... |
Finland | 445 | 9.5'' blue glass bird, hand-blown by Iittala craftsmen. ... |
France | 4580 | 13.67'' Daum France crystal ultraviolet vase. Daum Franc ... |
France | 2150 | Three bottles of 1961 Chateau Bages wine. Two 9'' x 13'' ... |
Haiti | 2094 | VeVe Collections teal handbag with sequins. Book, title: ... |
Haiti | 425 | 20'' x 16.5'' painting of Haitian women dressed in white ... |
Honduras | 560 | Two gold-colored Milla Guirst Joyeros cufflinks with May ... |
India | 7404 | Book, title: ``The Life of Mahatma Gandhi,'' by Louis Fi ... |
India | 2110 | Two 13'' x 17'' black and red scarves. 34'' gold-colored ... |
India | 1015 | 30'' x 42'' painting of bodhisattva on black and gold-co ... |
India | 710 | Two 10'' x 12'' miniature paintings depicting a small in ... |
India | 512 | 27'' x 78'' Kashmir Loom blue cashmere stole with orange ... |
India | 460 | 7'' long silver-colored jewelry box with bird designs, i ... |
Indonesia | 1658 | Two books. 36'' x 42'' blue Lenan Pearl of Silk scarf wi ... |
... (64 rows omitted)
sorted_by_value.sort('donor_country', distinct=True)
donor_country | value_usd | gift_description |
---|---|---|
Afghanistan | 2000 | 312'' x 117'' burgundy and red carpet with intricate pat ... |
Armenia | 2985 | 16'' x 16'' walnut chessboard with walnut and sterling s ... |
Austria | 625 | 32'' x 70'' Dries Van Noten patterned green silk shawl. ... |
Azerbaijan | 940 | Set of two brass and crystal tea cups, presented in a bl ... |
Bahrain | 430 | 8'' x 10'' Christolfe silver picture frame with a photog ... |
Brunei | 1932 | ``Lustre'' necklace with 1'' x 1'' inscribed ``Globe Cre ... |
Canada | 1000 | 17'' sugar maple-wood bowl signed by Don Thur. 24'' x 16 ... |
China | 1400 | White ``Flying Pigeon'' electric bicycle. Rec'd--9/27/20 ... |
Colombia | 525 | 8'' x 14'' Helena Caballero blue purse. Rec'd--9/24/2010 ... |
Croatia | 510 | Gena black three-piece suit Rec'd--4/26/2010. Est. Value ... |
... (44 rows omitted)
Visualizing Categorical Data¶
top_per_country = sorted_by_value.sort('donor_country', distinct=True)
top_per_country
donor_country | value_usd | gift_description |
---|---|---|
Afghanistan | 2000 | 312'' x 117'' burgundy and red carpet with intricate pat ... |
Armenia | 2985 | 16'' x 16'' walnut chessboard with walnut and sterling s ... |
Austria | 625 | 32'' x 70'' Dries Van Noten patterned green silk shawl. ... |
Azerbaijan | 940 | Set of two brass and crystal tea cups, presented in a bl ... |
Bahrain | 430 | 8'' x 10'' Christolfe silver picture frame with a photog ... |
Brunei | 1932 | ``Lustre'' necklace with 1'' x 1'' inscribed ``Globe Cre ... |
Canada | 1000 | 17'' sugar maple-wood bowl signed by Don Thur. 24'' x 16 ... |
China | 1400 | White ``Flying Pigeon'' electric bicycle. Rec'd--9/27/20 ... |
Colombia | 525 | 8'' x 14'' Helena Caballero blue purse. Rec'd--9/24/2010 ... |
Croatia | 510 | Gena black three-piece suit Rec'd--4/26/2010. Est. Value ... |
... (44 rows omitted)
# Bar chart: visualize categorical data
top_per_country.barh('donor_country', 'value_usd')

# Bar chart: visualize categorical data
top_per_country.sort('value_usd', descending=True).barh('donor_country', 'value_usd')

Table Operations: Selecting Rows¶
majors = Table.read_table("data/majors.csv")
majors
Major | Division | 2008-2012 | 2018-2021 |
---|---|---|---|
American Studies | 2 | 10 | 9 |
Anthropology | 2 | 8 | 4 |
Arabic Studies | 1 | 4 | 7 |
Art | 1 | 55 | 31 |
Asian Studies | 1 | 8 | 6 |
Astronomy | 3 | 1 | 2 |
Astrophysics | 3 | 3 | 3 |
Biology | 3 | 58 | 61 |
Chemistry | 3 | 30 | 34 |
Chinese | 1 | 11 | 9 |
... (27 rows omitted)
majors.sort("2018-2021", descending=True)
Major | Division | 2008-2012 | 2018-2021 |
---|---|---|---|
Economics | 2 | 84 | 105 |
Biology | 3 | 58 | 61 |
Mathematics | 3 | 53 | 61 |
Political Science | 2 | 58 | 54 |
Computer Science | 3 | 16 | 50 |
English | 1 | 65 | 50 |
Psychology | 3 | 62 | 45 |
History | 2 | 61 | 43 |
Chemistry | 3 | 30 | 34 |
Art | 1 | 55 | 31 |
... (27 rows omitted)
majors.where("Division", are.equal_to(3))
majors.where("Division", 3) # same
Major | Division | 2008-2012 | 2018-2021 |
---|---|---|---|
Astronomy | 3 | 1 | 2 |
Astrophysics | 3 | 3 | 3 |
Biology | 3 | 58 | 61 |
Chemistry | 3 | 30 | 34 |
Computer Science | 3 | 16 | 50 |
Geosciences | 3 | 7 | 12 |
Mathematics | 3 | 53 | 61 |
Physics | 3 | 12 | 13 |
Psychology | 3 | 62 | 45 |
Statistics | 3 | 0 | 16 |
Digression: Bar Charts for Multiple Variables¶
div3 = majors.where("Division", 3)
div3.drop("Division").barh("Major")

Terminology: method chaining
majors.where("Division", are.equal_to(3)).sort("2018-2021", descending=True)
Major | Division | 2008-2012 | 2018-2021 |
---|---|---|---|
Biology | 3 | 58 | 61 |
Mathematics | 3 | 53 | 61 |
Computer Science | 3 | 16 | 50 |
Psychology | 3 | 62 | 45 |
Chemistry | 3 | 30 | 34 |
Statistics | 3 | 0 | 16 |
Physics | 3 | 12 | 13 |
Geosciences | 3 | 7 | 12 |
Astrophysics | 3 | 3 | 3 |
Astronomy | 3 | 1 | 2 |
See complete list of are
conditions in our Python Reference.
majors.where("Division", are.not_equal_to(3)).sort("2018-2021", descending=True)
Major | Division | 2008-2012 | 2018-2021 |
---|---|---|---|
Economics | 2 | 84 | 105 |
Political Science | 2 | 58 | 54 |
English | 1 | 65 | 50 |
History | 2 | 61 | 43 |
Art | 1 | 55 | 31 |
Philosophy | 2 | 19 | 16 |
Political Economy | 2 | 16 | 13 |
Environmental Studies | 2 | 0 | 12 |
Comparative Literature | 1 | 5 | 11 |
French | 1 | 6 | 11 |
... (17 rows omitted)
majors.where("2018-2021", are.above(30))
Major | Division | 2008-2012 | 2018-2021 |
---|---|---|---|
Art | 1 | 55 | 31 |
Biology | 3 | 58 | 61 |
Chemistry | 3 | 30 | 34 |
Computer Science | 3 | 16 | 50 |
Economics | 2 | 84 | 105 |
English | 1 | 65 | 50 |
History | 2 | 61 | 43 |
Mathematics | 3 | 53 | 61 |
Political Science | 2 | 58 | 54 |
Psychology | 3 | 62 | 45 |
majors.where("2018-2021", are.between(10,20))
Major | Division | 2008-2012 | 2018-2021 |
---|---|---|---|
Comparative Literature | 1 | 5 | 11 |
Environmental Studies | 2 | 0 | 12 |
French | 1 | 6 | 11 |
Geosciences | 3 | 7 | 12 |
Philosophy | 2 | 19 | 16 |
Physics | 3 | 12 | 13 |
Political Economy | 2 | 16 | 13 |
Sociology | 2 | 9 | 11 |
Spanish | 1 | 8 | 10 |
Statistics | 3 | 0 | 16 |
majors.where("Major", are.containing('ics'))
Major | Division | 2008-2012 | 2018-2021 |
---|---|---|---|
Astrophysics | 3 | 3 | 3 |
Classics | 1 | 5 | 6 |
Economics | 2 | 84 | 105 |
Mathematics | 3 | 53 | 61 |
Physics | 3 | 12 | 13 |
Statistics | 3 | 0 | 16 |