CSCI 104: Understanding Data Through Computation

Katie Keith and Stephen Freund

Many of the world’s greatest discoveries and most consequential decisions are enabled or informed by the analysis of data from a myriad of sources. Indeed, the ability to organize, visualize, and draw conclusions from data is now a critical tool in the sciences, business, medicine, politics, other academic disciplines, and society as a whole. This course lays the foundations for reasoning about data by exploring complementary computational, statistical, and visualization concepts. These concepts will be reinforced by lab experiences designed to teach programming and statistics skills while analyzing real-world data sets. This course will also examine the broader context and social issues surrounding data analysis, including privacy and ethics.

Remote Attendance via Zoom

If you must miss lecture, lab, or office hours for health reasons, please contact us to make alternative arrangements. We will either provide access to the necessary materials in Glow or plan for you to attend via this Zoom link. You must be on campus or logged into the college’s VPN to access that link.

Calendar

Due Dates

Pre-Labs are due Monday at 10am.

Labs are due:

  • Wednesday at 10pm for the Monday lab groups.

  • Thursday at 10pm for the Tuesday lab groups.

Any exceptions and other due dates are listed on the calendar.

Welcome
09/09 F Welcome Ch 1 Slides, Notebook Prelab 1 9/14@10am
09/12 M Cause & Effect Ch 2 Slides Lab 1
Python, Data Wrangling, and Visualization
09/14 W Tables Ch 3 Slides, Notebook Prelab 2
09/16 F Data Types Ch 4, 5 Slides, Notebook Lab 2
09/19 M Columns and Rows Ch 6.1, 6.2 Slides, Notebook
09/21 W Tables and Visualization Ch 6.3, 6.4, Ch 7 Slides, Notebook Prelab 3
09/23 F Charts Ch 7, 7.1 Slides, Notebook Lab 3
09/26 M Histograms Ch 7.2, 7.3 Slides, Notebook
09/28 W Functions Ch 8, 8.1, System Error Ch 5 Slides, Notebook Prelab 4 Case 1 10/05@10am
09/30 F Groups Ch 8.2, 8.3 Slides, Notebook Lab 4
10/03 M Pivots and Joins Ch 8.4 Slides, Notebook
10/05 W Case Study 1: Sweeney Linking Study System Error Ch 5 Slides, Notebook
10/07 F Table Examples Ch 8.5 Slides, Notebook
10/10 M Reading Period
10/11 T Reading Period
Distributions and Random Sampling
10/12 W Conditionals and Loops Ch 9, 9.1, 9.2 Slides, Notebook Prelab 5 Project 1 (10/23@10pm checkpoint, 10/27@10pm)
10/14 F Mountain Day Placeholder Lab 5
10/17 M Chance and Simulation Ch 9.3, 9.4 Slides, Notebook
10/19 W Sampling Ch 9.5, 10 Slides, Notebook
10/21 F Inference with Statistics Ch 10.2, 10.3, 10.4 Slides, Notebook
Hypothesis Testing
10/24 M Assessing Models Ch 11.1, 11.2 Slides, Notebook
10/26 W Hypothesis Testing Ch 11.3, 11.4 Slides, Notebook, Sample Midterm, Solutions Prelab 6
10/28 F Statistical Significance Ch 11.4, 12.1 Slides, Notebook Lab 6 11/06@10pm
10/31 M Causality Ch 12.2 Slides, Notebook Case 2 11/07@10am
11/02 W Randomized Controlled Experiments Ch 12.3 Slides, Notebook
11/03 R Evening Midterm Midterm Details
11/04 F No class!
11/07 M Case Study 2: Consequentialism and Deontology Bit-by-Bit Ch 6.5, Bit-by-Bit Ch 6.6 (Skim) Slides Lab 7, Final Project Partners 11/11@10am
Estimation
11/09 W Bootstrapping Ch 13, 13.1, 13.2 Slides, Notebook Prelab 8 Project 2 (11/18@4pm checkpoint, 12/16@4pm)
11/11 F Interpreting Confidence Ch 13.3, 13.4 Slides, Notebook Lab 8
Prediction
11/14 M Confidence and Correlation Ch 15, 15.1 Slides, Notebook, Midterm Solutions Case 3 11/22@4pm
11/16 W Correlation Coefficients Slides, Notebook
11/18 F Linear Regression Slides, Notebook Project Meetings Prelab 9 11/22@4pm
11/21 M Linear Regression: Optimization and Diagnostics Ch 15.5, 15.6, 16.3 Slides, Notebook
11/23 W Thanksgiving Break
11/24 R Thanksgiving Break
11/25 F Thanksgiving Break
11/28 M Case Study 3: Mobile Location Data Serkez (NYT), 2020, Covid Mobility Site, Chang et al. 2020 Slides Lab 9
11/30 W Wrap Up Slides, Notebook CS 104 Survey
12/02 F Katie's Research Slides
12/05 M Steve's Research Slides
12/07 W No class -- project meetings
12/09 F No class -- project meetings
12/16 F Final Project Due (4pm)