Syllabus#

Website: http://www.cs.williams.edu/~cs104/

Instructors: Stephen Freund (TPL 302)

Q&A and Discussion: Piazza

Staff Email: cs104staff@williams.edu

Lectures: MWF 10:00-10:50am and 11:00-11:50am in Schow 030B

Labs: MT 1:00-2:30 and 2:30-4:00 in Schow 027

Course Description#

Many of the world’s greatest discoveries and most consequential decisions are enabled or informed by the analysis of data from a myriad of sources. Indeed, the ability to organize, visualize, and draw conclusions from data is now a critical tool in the sciences, business, medicine, politics, other academic disciplines, and society as a whole. This course lays the foundations for reasoning about data by exploring complementary computational, statistical, and visualization concepts. These concepts will be reinforced by lab experiences designed to teach programming and statistics skills while analyzing real-world data sets. This course will also examine the broader context and social issues surrounding data analysis, including privacy and ethics.

Learning Outcomes#

By the end of the course, students will be able to

  • Apply practical data science and Python programming skills to any domain.

  • Represent, wrangle, manipulate, and visualize real-world datasets using Python programming.

  • Implement and apply computational approaches to statistical inference.

  • Perform simulation to test hypotheses, bootstrapping to estimate confidence intervals, and linear regression to make predictions.

Organization#

We will meet three times each week for lecture and once a week for lab. During lecture hours, we will learn new concepts and problem-solving strategies. During the 90-minute lab section, we will gain hands-on experience with those concepts through programming assignments.

Piazza Discussion Forum and Emailing Us#

We will use Piazza for announcements, questions, answers, and online class discussions. Please check for announcements there regularly, and post any questions about logistics and material on Piazza so everyone can benefit from the discussion. If someone posts a question you can answer, go for it! You may post to the forum anonymously if you prefer.

If you have personal matters you’d like to discuss, please email us.

Course Components#

Lectures#

There are two lecture sections. Please attend the section that you are registered for.

Labs#

Each week, you will work on a lab assignment covering the topics from lecture. These labs will involve programming in Python. The computing platform on which you will complete your work is accessible via a web browser from any computer. We will provide computers to use during scheduled lab meetings, and they will also be available for you to use outside of lab hours. You may also use your own computer.

We will post labs on Friday so that you may look them over before our scheduled lab meetings on Mondays and Tuesdays. Labs are due:

  • Wednesday at 10pm for the Monday lab groups.

  • Thursday at 10pm for the Tuesday lab groups.

Late Days for Labs#

Each student may use a maximum of three late days for labs during the course of the semester. A single late day enables you to hand in a lab up to 24 hours after the original due date.

You may use one or more late days on a lab by filling out our late day form prior to the submission deadline. We check this form after every lab deadline, so there is no need to email the instructors.

Once those late days are used up, late labs will be penalized 20% per day.

You may not use late days for prelabs, case studies, projects, or midterms.
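The late-day rules above can be illustrated with a small Python sketch. It rests on an assumption the syllabus does not spell out: that "20% per day" means 20 percentage points of the lab's raw score are deducted for each 24-hour period not covered by a late day, bottoming out at zero. The function name and its parameters are hypothetical and are not part of any course tool; ask the instructors how the penalty is actually applied.

```python
def late_lab_score(raw_score, days_late, late_days_applied):
    """Hypothetical helper: score for a lab after any late penalty.

    ASSUMPTION (not stated in the syllabus): each day late that is not
    covered by a late day removes 20 percentage points of the raw score,
    and the score never drops below zero.
    """
    if days_late < 0 or late_days_applied < 0:
        raise ValueError("day counts must be non-negative")
    # Days beyond those covered by the late days the student applied.
    penalized_days = max(0, days_late - late_days_applied)
    # 20% per uncovered day, capped so the factor stays in [0, 1].
    factor = max(0.0, 1.0 - 0.20 * penalized_days)
    return raw_score * factor


# Illustrative values only: a lab handed in two days late with both
# days covered by late days keeps its full score.
print(late_lab_score(90, days_late=2, late_days_applied=2))
```

Under this reading, a lab handed in three days late with only one late day applied would lose 40% of its raw score.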

Prelabs and Case Studies#

We’ll also assign short prelab exercises designed to help you prepare for a productive scheduled lab session. We will post prelabs on Wednesday, and you must complete them by Monday morning. You may not use late days for prelabs.

You will also read and complete several short case studies during the semester. These studies relate the technical material to ethical and privacy questions surrounding the analysis of data.

Midterm Exams#

The first exam is tentatively scheduled for the evening of Thursday, Oct. 24, and the second for the evening of Thursday, Nov. 21. You may attend either of two time slots on those evenings: 6-7:30pm or 8-9:30pm. The exams will be closed book and closed notes, and will focus on conceptual understanding of the material. Details regarding the specific format of the exams will be discussed in class.

Notify us at least two weeks in advance if you have known conflicts with the exam times.

Projects#

You will have two larger projects in the course. The midterm project will be due on Thursday, Oct. 17. The second will be a final project synthesizing the concepts and skills from the entire semester. It will be due on Friday, Dec. 13.

Grading#

Final grades will reflect the course components as follows.

Component                   Weight

Labs                        30%
Prelabs and Case Studies    5%
Midterm Exam                12.5%
Midterm Project             15%
Second Midterm Exam         12.5%
Final Project               25%
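As a rough illustration of how these weights combine, here is a minimal Python sketch of a weighted average. The weights are copied from the list above; the function name, the 0-100 score scale, and the example scores are assumptions made for illustration, and the sketch does not describe how final grades are actually computed or converted to letter grades.

```python
# Weights copied from the grading components above (percent of final grade).
WEIGHTS = {
    "Labs": 30.0,
    "Prelabs and Case Studies": 5.0,
    "Midterm Exam": 12.5,
    "Midterm Project": 15.0,
    "Second Midterm Exam": 12.5,
    "Final Project": 25.0,
}


def weighted_grade(scores):
    """Hypothetical helper: weighted average of component scores.

    ASSUMPTION: every component score is on a 0-100 scale.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())  # 100 for the weights above
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / total_weight


# Made-up scores, for illustration only.
example_scores = {
    "Labs": 92,
    "Prelabs and Case Studies": 100,
    "Midterm Exam": 84,
    "Midterm Project": 88,
    "Second Midterm Exam": 80,
    "Final Project": 90,
}
print(round(weighted_grade(example_scores), 2))
```

Because labs and the final project together carry 55% of the weight, they move the overall average more than either exam does.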

Workload#

Attendance is required in both lecture and lab. In general, beyond the 4 hours we spend together during our class and lab meetings, you should expect to spend (on average) approximately 7-8 hours per week on work related to class. Aside from the weekly lab and prelab assignments, you are responsible for reading supporting material and investigating our online resources as necessary.

Expectations and Norms#

You can expect us (the instructors) to:

  • Contribute to and support a respectful and welcoming environment.

  • Start and end class on time.

  • Craft lectures and assignments designed to help you learn the material.

  • Reply to Piazza posts and emails within 24 hours on weekdays and 48 hours on weekends.

We can expect you (the students) to:

  • Contribute to and support a respectful and welcoming environment.

  • Attend all lecture and lab sessions in person except for health emergencies or extenuating circumstances.

  • Arrive to class and lab on time, and plan to stay until the end.

  • Stay engaged in the class and material.

  • Reach out for help from the TAs or instructors.

  • Adhere to the Honor Code.

Class Norms:

  • If you become sick with COVID or another illness, please stay home and let us know via email.

  • If you must miss class for other reasons, please give us as much advance notice as possible.

  • The Computer Science department strives to be a friendly and welcoming community. You may find it slightly less formal (but no less respectful) than what you encountered in previous academic settings. For example, most students and faculty address us by our first names (e.g., Steve). You are welcome to do so as well.

  • You are also welcome to address us informally in email (e.g., starting an email with “Hi Steve”). Here are a few other tips for emailing professors if that is new to you or outside your comfort zone.

  • We have set student help hours. Feel free to use these times to discuss questions adjacent to the course.

  • Steve uses he/him pronouns. We will try to use your preferred pronouns, as indicated in PeopleSoft. Please don’t hesitate to correct us.

Honor Code#

For computer assignments in computer science courses, the honor code is interpreted in very specific ways. Unless otherwise designated, labs are expected to be the work of the individual student, designed and coded by them alone. Help locating errors and interpreting error messages is allowed, but a student may only receive help in correcting errors of syntax; help in correcting errors of logic is strictly forbidden. In general, if you are taking photos of someone else’s screen, looking at someone else’s screen, or telling someone else what to type, it is likely the work is no longer the work of an individual student.

The College and Department also have computer usage policies that apply to courses that make use of computers. Read more about these policies here.

Outside references: When completing work for this class, you may refer to any course materials and the resources linked to from any of our web pages. You may not use other sources to look for solutions to any assignments, and you may never copy prose or code directly from another source. At times, we will ask you to find data sets or explore additional topics on the web – we will make our expectations for consulting other references explicit in those cases.

Sharing solutions: Please do not post your solutions to our assignments in any public forum, including public GitHub repositories. Students taking the course should not be looking for solutions, but tempting them by making solutions available is also inappropriate. This applies not just to the semester in which you take the course, but to the future as well.

If in doubt as to what is appropriate, do not hesitate to ask. We’re happy to discuss this anytime.

Intellectual Property#

As per College policy, no part of this course may be reproduced and/or distributed. In particular, no videos recorded as part of this class may be shared with anyone external to the CS104 course.

Academic Resources on Campus#

There are many academic resources available to you on campus. Here are a few that may be useful throughout the semester.

Accommodations#

If formal accommodations need to be made to meet your specific learning or physical abilities, you should contact your instructors as soon as possible to discuss appropriate accommodations. You should also contact the Director of Accessible Education, Dr. G. L. Wallace (x4672) or the Dean’s office (x4171). We will work together to ensure this class is accessible and inclusive.

Mental Health#

If you are experiencing mental or physical health challenges that are significantly affecting your academic work, you are encouraged to contact your instructor and/or speak with Dean’s Office staff (x4171).

Public Health#

To keep our classroom and lab environments as healthy as possible, students and staff will be required to wear a mask at all times in the classroom and lab. If you feel ill, please do not come to class or lab, and let us know if you are unable to attend class due to COVID restrictions. We will work with you to make sure you can make up any missed work, and to develop a plan that allows you to continue making progress in the course during your time in isolation/quarantine.

Acknowledgements#

This course was developed by Steve Freund and Katie Keith. CSCI 104 uses materials from the Data 8 course at the University of California, Berkeley, as well as publicly available materials and data sets from a variety of other sources, including UMass COMPSCI/STATS 190F and Cornell CS 1380.