Computer Science 374
Machine Learning

Fall 2005


Professor: Andrea Danyluk
Office: TCL 308
Email: andrea "at" cs "dot" williams "dot" edu
Phone: x2178
Office Hours: Mon. 1-1:50, Tues. 10:30-12, Wed. 11-12, and by appointment.

Text
Introduction to Machine Learning by Ethem Alpaydin, MIT Press, 2004. See http://www.cmpe.boun.edu.tr/~ethem/i2ml for more information on the text, including Errata.

Course Overview

Machine Learning is an area within Artificial Intelligence that aims to develop and analyze algorithms that automatically improve a system's performance. Automatic improvement might include: (1) learning to perform a new task; (2) learning to perform a task more efficiently or effectively; or (3) learning and organizing new facts that can be used by a system that relies upon such knowledge.

Machine learning algorithms have influenced work in areas other than Artificial Intelligence itself. For example, some of the algorithms developed have contributed to the study of human learning in cognitive science and cognitive psychology. Recent practical applications of machine learning algorithms have included data mining for discovery of new information from scientific data, for fraud detection, and for market analysis.

There are many ways to categorize machine learning algorithms. One is based on the amount and type of background information provided to the algorithm. This course will introduce specific instances of two categories of learning algorithms - supervised algorithms and unsupervised algorithms. It will also introduce methods for evaluating the performance of machine learning algorithms. Finally, it will introduce topics in computational learning theory.
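
To make the distinction between the two categories concrete, here is a minimal sketch in Python. It is only an illustration, not necessarily one of the algorithms covered in the course: the function names, the choice of a nearest-neighbor classifier and a two-cluster k-means, and the toy data are all assumptions made for this example. The supervised learner is given labeled examples and is evaluated on held-out labeled data; the unsupervised learner sees the same points with the labels removed.

```python
import math

def nearest_neighbor_predict(train, query):
    """Supervised: predict the label of the training point closest to query."""
    _, label = min(train, key=lambda pair: math.dist(pair[0], query))
    return label

def accuracy(train, test):
    """Evaluation: fraction of held-out examples that are labeled correctly."""
    correct = sum(1 for x, y in test if nearest_neighbor_predict(train, x) == y)
    return correct / len(test)

def two_means(points, iterations=10):
    """Unsupervised: k-means with k = 2; no labels are used."""
    # Hypothetical initialization: the first and last points serve as centers.
    centers = [points[0], points[-1]]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for p in points:
            nearest = min((0, 1), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters

# Toy data, invented for illustration: (point, label) pairs.
train = [((1.0, 1.0), "a"), ((1.5, 2.0), "a"), ((8.0, 8.0), "b"), ((9.0, 7.5), "b")]
test = [((2.0, 1.5), "a"), ((7.5, 9.0), "b"), ((5.5, 5.5), "a")]

# The unsupervised learner receives the same points with the labels stripped.
points = [x for x, _ in train + test]

if __name__ == "__main__":
    print("held-out accuracy:", accuracy(train, test))
    centers, clusters = two_means(points)
    print("cluster centers:", centers)
```

On this toy data the classifier labels two of the three held-out points correctly, which illustrates why performance is measured on examples the algorithm did not see during training.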

Work in the course will involve reading, solving problems, running and evaluating learning algorithms, and implementing some of them.

Assignments and Weekly Meetings

As you know, this course will be taught as a tutorial. Each of you will be assigned to a group of two (or three) students. Your group will normally meet with me each week to discuss the readings and exercises assigned for that week. (Meetings will be held in my office - TCL 308.) In the "canonical" tutorial format, one student in each group is primarily responsible for the presentation each week. In this course, because of the cumulative nature of the material we will be considering, I will instead expect each of you to come to each meeting fully prepared to present summaries of the readings and solutions to the exercises. To compensate for the higher workload this approach entails compared to the "canonical" tutorial, you are encouraged to work on the assignments with others taking the course. To make our meetings more interesting, however, I ask that you not work as one large class group.

The tutorial format naturally emphasizes learning more than just the specific material covered by the course. In particular, this format can have the very positive effect of encouraging you to develop your ability to learn new material independently and to improve your ability to present your thoughts orally. Therefore, in completing the assignments, you should do more than prepare written solutions. You should take some time to think about how you will present your solutions. I do not mean that you should prepare a lecture. Rather, take the time to look over each reading and decide what the major points are. Do the same with the exercises.

For more on assignments and weekly meetings, see the week-by-week schedule.

"Favorite Applications" Presentation

During the week of December 5 (the final week of the semester), we will skip our usual meetings and instead meet as a group. During this time, each of you will give a short presentation on the machine learning application of your choice.

Final Exam

There will be only one exam - the final exam. You will be given a specific research paper on a current topic in machine learning, and you will be asked to analyze the paper and provide a critique of the work. The exam will be given as a 24-hour take-home exam through the Registrar's Office.

Grading Policy

In determining final grades, the following weighting will be used: weekly meetings and assignments 75%, "Favorite Applications" presentation 10%, final exam 15%.

Assignments are due at the end of the tutorial session during which they are discussed. In order for our weekly sessions to be as interesting and productive as possible, it is absolutely essential that you be prepared. Please bring your work with you, so that you can discuss it and then leave it with me to be graded.

Honor Code

Final Exam: You may refer to your text, your course notes, and your assignments while taking the final exam, and you may talk to me. No other sources of information are permitted.

Homework: Collaboration on problems is permitted and even encouraged, as noted above. However, copying of solutions is not. The work you hand in should be your own. A good rule to follow is to work through the problems with others, taking only rough notes. You should then write your solutions independently, referring to your notes as little as possible. The idea is to understand each solution well enough that you can reconstruct it by yourself. In addition, you should write on each assignment the list of people with whom you collaborated.