CSCI 374

Machine Learning

Instructor Information

Professor: Andrea Danyluk
Email: andrea "at" cs "dot" williams "dot" edu
Phone: x2178
Office: TCL 305
Office Hours: If my door is open, you're welcome to come by (both weekdays and weekends). Scheduled hours: Mon 1:30-3:30pm, Thurs 1:30-2:30pm.

Course Description

Machine Learning is an area within Artificial Intelligence that aims to develop and analyze algorithms that automatically improve a system's performance. Automatic improvement might include: (1) learning to perform a new task; (2) learning to perform a task more efficiently or effectively; or (3) discovering and organizing new facts that can be used by a system that relies upon such knowledge. Consider, for example, predicting movie ratings (Netflix), beating a professional player at the game of Go (DeepMind's AlphaGo), driving a car (Google), or serving as a personal assistant (Apple's Siri).

Machine learning algorithms have influenced work in areas other than Artificial Intelligence itself. For example, some of the algorithms developed have contributed to the study of human learning in cognitive science and cognitive psychology. Recent practical applications of machine learning algorithms have included data mining for discovery of new information from scientific data, for fraud detection, and for market analysis.

There are many ways to categorize machine learning algorithms. One is based on the amount and type of background information provided to the algorithm. This course will introduce specific instances of two categories of learning algorithms - supervised algorithms and unsupervised algorithms. It will also introduce methods for evaluating the performance of machine learning algorithms. Finally, it will introduce topics in computational learning theory as well as instances of particular applications.

Work in the course will involve reading, solving problems, running and evaluating learning algorithms, and implementing some as well.

Assignments and Weekly Meetings

As you know, this course will be taught as a tutorial. You and your tutorial partner will normally meet with me each week to discuss the readings and exercises I assigned for that week. (Meetings will be held in my office - TCL 305.) In the "canonical" tutorial format (most typically followed in Divisions I and II), all students are responsible for the readings, but each week one student in each group is primarily responsible for the presentation (typically a paper), while the other is responsible for providing a critique of the tutorial partner's work. I may occasionally assign specific roles to tutorial partners. However, because of the cumulative nature of the material, I will, in general, expect each of you to come to each meeting fully prepared to present analyses of the readings and solutions to the exercises.

A hallmark of tutorials is that they make students more responsible for learning than they might be in a "standard format" course. Therefore, one of my goals will be to help you develop as an independent learner. A hallmark of computer science as a discipline is its collaborative nature. Thus a second (but not secondary) goal is to provide you with opportunities to develop your skills as collaborators. Here's how we'll balance the two: You will have the option of working alone or with classmates. (The exception is that paper reviews must be done independently.) You must, however, write up your work independently. (Again, there is an exception: programming may be done in teams of two.)

Remember that I'm also here to help you. In tutorials students sometimes get into a mindset where they feel they can't ask the professor for help. This sort of independence can be great, but don't allow it to get in the way of progress. If you're stuck, ask me! For more details on partners, writing up assignments, etc., please see the Honor Code Guidelines.

For more on assignments and weekly meetings, see the week-by-week schedule.

Final Assignments/Projects

During the final two weeks of the semester, you and your tutorial partner will have the opportunity to explore a topic of your own choosing in more detail. Your assignment for the penultimate week of classes will be to design and write an assignment in the style and at the level of those I have prepared for you. During the final week, you will complete that assignment. I will provide more detail as we get closer to the end of the semester.

Final Exam

There will be no final exam in this course.

Grading Policy

Each week of work accounts for one-twelfth of your final grade. In determining your weekly grade, I will consider your level of preparation for the tutorial session, your active and productive participation in the session, the clarity of your presentation and contributions, as well as the correctness of your written work.

Assignments are due at the end of the tutorial session during which they are discussed. In order for our weekly sessions to be as interesting and productive as possible, it is absolutely essential that you be prepared. Please bring your work with you, so that you can discuss it and then leave it with me.

Honor Code Guidelines

Explanation of the Honor Code as it pertains to this course:

I. Explanation related to help from people (i.e., non-written sources):

Reading responses / paper reviews: These are expected to be your work alone. When assigned a reading for which you are also writing a review, do not discuss the paper or your response with anyone other than the instructor (i.e., me). This will ensure that our tutorial session discussions will be fresh and interesting.

Regular weekly non-programming assignments other than reading responses or paper reviews: You may work alone or with classmates.

If you choose to work with classmates, you may discuss the problems and work out your solutions together, but you must write them up individually. You must also clearly cite your partner(s) on the work you turn in. Also clearly cite any written sources outside of those assigned.

Consulting any person outside of the class is a violation of the honor code.

Programming assignments: On each programming assignment you will have the option of working either alone or with one classmate. The partner need not be your tutorial partner, though working with your tutorial partner on code will be convenient for tutorial meetings. Programs that you turn in (just one if working with a classmate) must contain only: code written solely by your group or code written by yourself or other CS 374 students this semester for previous assignments. Code from previous assignments must be clearly credited where it is used and in a separate written note to me. It may be used only with permission of the students involved.

You are welcome to discuss design, debugging, and mathematics related to programming projects with other students, but you may not review the code of other students for any current assignment. Note that permitting the use of previously submitted code and the discussion of programming projects makes this policy more liberal than the default CS department policy.

Recall that in accordance with CS department policies, looking at any other computer user's files without permission is unacceptable, regardless of whether those files are protected on the file system.

II. Explanation related to help from written sources (paper or electronic):

I have provided a number of resources both in hard copy (in the lab) and electronically. You are welcome to use those resources when working on your assignments.

If you are using a resource not specifically mentioned in an assignment write-up, you must cite it.

Source code for the Weka data mining toolkit is freely available. Do not use this as a "resource" when doing your own programming. All code you submit should be your own, with the exception of shared code as outlined above. If you have any questions about code resources you can or cannot use, please ask!