CSCI 104: Understanding Data Through Computation

Katie Keith and Stephen Freund

Many of the world’s greatest discoveries and most consequential decisions are enabled or informed by the analysis of data from a myriad of sources. Indeed, the ability to organize, visualize, and draw conclusions from data is now a critical tool in the sciences, business, medicine, politics, other academic disciplines, and society as a whole. This course lays the foundations for reasoning about data by exploring complementary computational, statistical, and visualization concepts. These concepts will be reinforced by lab experiences designed to teach programming and statistics skills while analyzing real-world data sets. This course will also examine the broader context and social issues surrounding data analysis, including privacy and ethics.

Remote Attendance via Zoom

If you must miss lecture, lab, or office hours for health reasons, please contact us to make alternative arrangements. We will either provide access to the necessary materials in Glow or plan for you to attend via this Zoom link. You must be on campus or logged into the college’s VPN to access that link.

Calendar

Due Dates

Pre-Labs are due Monday at 10am.

Labs are due:

  • Wednesday at 10pm for the Monday lab groups.

  • Thursday at 10pm for the Tuesday lab groups.

Any exceptions and other due dates are listed on the calendar.

Mon 09/05
Tue 09/06
Wed 09/07
Thu 09/08

Mon 09/12: Cause & Effect
Tue 09/13
Thu 09/15
Fri 09/16: Data Types

Mon 09/19: Columns and Rows
Tue 09/20
Wed 09/21: Tables and Visualization
Thu 09/22

Mon 10/03: Pivots and Joins
Tue 10/04
Wed 10/05: Case Study 1: Sweeney Linking Study
Thu 10/06
Fri 10/07: Mountain Day Placeholder

Mon 10/10: Reading Period
Tue 10/11: Reading Period
Wed 10/12: Table Examples
Thu 10/13
Fri 10/14: Conditionals and Iteration

Mon 10/17: Chance and Simulation
Tue 10/18
Wed 10/19: Sampling
Thu 10/20
Fri 10/21: Empirical Distributions

Mon 10/24: Assessing Models
Tue 10/25
Wed 10/26: Decisions and Uncertainty
Thu 10/27
Fri 10/28: Hypothesis Testing

Mon 10/31: Causality
Tue 11/01
Wed 11/02: Examples
Thu 11/03: Evening Midterm
Fri 11/04: No class!

Mon 11/07: Case Study 2: Patreon Hacked Data
Tue 11/08
Wed 11/09: Bootstrapping
Thu 11/10
Fri 11/11: Interpreting Confidence

Mon 11/14: Center and Spread
Tue 11/15
Wed 11/16: The Normal Distribution
Thu 11/17
Fri 11/18: Sample Means

Mon 11/21: Designing Experiments
Tue 11/22
Wed 11/23: Thanksgiving Break
Thu 11/24: Thanksgiving Break
Fri 11/25: Thanksgiving Break

Mon 11/28: Correlation
Tue 11/29
Wed 11/30: Linear Regression
Thu 12/01
Fri 12/02: Least Squares

Mon 12/05: Case Study 3: Mobile Location Data and Wrap Up
Tue 12/06
Wed 12/07: TBD
Thu 12/08
Fri 12/09: TBD

Mon 12/12
Tue 12/13
Wed 12/14
Thu 12/15
Fri 12/16: Final Project Due (4pm)