CSCI 104: Understanding Data Through Computation

Fall 2022

Many of the world’s greatest discoveries and most consequential decisions are enabled or informed by the analysis of data from a myriad of sources. Indeed, the ability to organize, visualize, and draw conclusions from data is now a critical tool in the sciences, business, medicine, politics, other academic disciplines, and society as a whole. This course lays the foundations for reasoning about data by exploring complementary computational, statistical, and visualization concepts. These concepts will be reinforced by lab experiences designed to teach programming and statistics skills while analyzing real-world data sets. This course will also examine the broader context and social issues surrounding data analysis, including privacy and ethics.

Is This the Class for You?

This class studies core computing and data science concepts, using computation as a lens for exploring data manipulation, visualization, and statistics. However, no prior computer science, programming, or statistics experience is required or expected. Indeed, CSCI 104 is accessible to all students wishing to learn about these topics, regardless of academic background or interests.

Format and Topics

Lectures and weekly labs. Lab assignments will use the Python programming language and existing data science tools to apply concepts from lecture to data sets drawn from a wide range of sources.

Topics include: cause and effect, manipulating tables, visualizing data, simulation and chance, models, uncertainty, A/B testing, causality, experiment design, and inference.

Learning Outcomes

  • Combine techniques from computer science and statistics to analyze data.

  • Come away with practical data science skills applicable to any domain.

  • Run experiments, test hypotheses, and draw inferences.

  • Quantify and understand uncertainty in data.

  • Illustrate the above concepts with real-world data from a variety of domains.