CSCI 374

Machine Learning

Instructor Information

Professor: Andrea Danyluk
email: andrea "at" cs "dot" williams "dot" edu
phone: x2178
Office: TCL 305
Office Hours: Monday 9PM-11PM, Tuesday 9PM-11PM, Thursday 1:30PM-2:30PM. You can find me in my office most Sundays. My schedule is pretty flexible on Mondays and Fridays. Tuesdays and Wednesdays are a bit more full with various meetings, but feel free to try then as well.

Course Description

Machine Learning is an area within Artificial Intelligence that has as its aim the development and analysis of algorithms that are meant to automatically improve a system's performance. Automatic improvement might include: (1) learning to perform a new task; (2) learning to perform a task more efficiently or effectively; or (3) learning and organizing new facts that can be used by a system that relies upon such knowledge.

Machine learning algorithms have influenced work in areas other than Artificial Intelligence itself. For example, some of the algorithms developed have contributed to the study of human learning in cognitive science and cognitive psychology. Recent practical applications of machine learning algorithms have included data mining for discovery of new information from scientific data, for fraud detection, and for market analysis.

There are many ways to categorize machine learning algorithms. One is based on the amount and type of background information provided to the algorithm. This course will introduce specific instances of two categories of learning algorithms - supervised algorithms and unsupervised algorithms. It will also introduce methods for evaluating the performance of machine learning algorithms. Finally, it will introduce topics in computational learning theory.

Work in the course will involve reading, solving problems, running and evaluating learning algorithms, and implementing some as well.

Assignments and Weekly Meetings

As you know, this course will be taught as a tutorial. Each of you will be assigned to a group of two (or three) students. Your group will normally meet with me each week to discuss the readings and exercises I assigned for the week. (Meetings will be held in my office - TCL 305.) In the "canonical" tutorial format, all students are responsible for the readings, but each week one student in each group is primarily responsible for the presentation (typically a paper), while the other is responsible for providing a critique of the tutorial partner's work. There will be a few weeks in which I will assign specific roles to tutorial partners. However, because of the cumulative nature of the material we will be considering, I will, in general, expect each of you to come to each meeting fully prepared to present analyses of the readings and solutions to the exercises. To compensate for the somewhat higher workload such an approach entails compared to the "canonical" tutorial, you are encouraged to work on the assignments with others taking the course. To make our meetings more interesting, however, I ask that you not work as one large class group. In general, our meetings will be most interesting if you have worked with someone other than your tutorial partner.

The tutorial format naturally emphasizes learning more than just the specific material covered by the course. In particular, this format can have the very positive effect of encouraging you to develop your ability to learn new material independently and to improve your ability to present your thoughts orally. Therefore, in completing the assignments, you should do more than prepare written solutions. You should take some time to think about how you will present your solutions. I do not mean that you should prepare a lecture. Rather, take the time to look over each reading and decide what the major points are. Do the same with the exercises.

For more on assignments and weekly meetings, see the week-by-week schedule.

"Favorite Applications" Presentation

During the week of May 5 (the final week of the semester), we will skip our usual meetings and instead meet as a group. During this time, each of you will give a presentation on the machine learning application of your choice. There is a chance that this activity will be cancelled or modified. I am currently in conversation with a colleague at a research university about the possibility of a field trip. If we're able to schedule that, then this is the activity that will be shifted. In addition, one student has asked me about the possibility of doing a large programming project in the course. I am open to the idea of some of you presenting your own projects, rather than applications you've researched.

Final Exam

There will be only one exam - the final exam. You will be given a specific research paper on a current topic in machine learning, and you will be asked to analyze the paper and provide a critique of the work. The exam will be given as a 24-hour take-home exam through the Registrar's Office.

Grading Policy

In determining final grades, the following weighting will be used: weekly meetings and assignments 75%, "Favorite Applications" presentation 10%, final exam 15%.
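
As a rough illustration of how these weights combine, here is a minimal sketch in Python. It assumes each component is scored on a 0-100 scale, which the policy above does not specify. The names WEIGHTS and final_score and the sample scores are hypothetical, chosen only for this example.

```python
# Weights from the grading policy above (they sum to 1.0).
# The dictionary keys are hypothetical labels for the three components.
WEIGHTS = {
    "weekly": 0.75,        # weekly meetings and assignments
    "presentation": 0.10,  # "Favorite Applications" presentation
    "final_exam": 0.15,    # final exam
}


def final_score(scores):
    """Return the weighted course score.

    Assumes every component score is on a 0-100 scale (an assumption;
    the syllabus does not state the scale).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up scores, purely for illustration.
example = {"weekly": 90.0, "presentation": 80.0, "final_exam": 70.0}
print(round(final_score(example), 2))
```

Because weekly meetings and assignments carry three quarters of the weight, consistent preparation each week matters far more to the result than either the presentation or the final exam.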

Assignments are due at the end of the tutorial session during which they are discussed. In order for our weekly sessions to be as interesting and productive as possible, it is absolutely essential that you be prepared. Please bring your work with you, so that you can discuss it and then leave it with me to be graded.

Honor Code Guidelines

[Adapted from the Honor Code Guidelines for Computer Science Courses.]

For programming assignments in this course, the honor code is interpreted in very specific ways. When a program is assigned, your instructor will identify it as an "individual" or "team" assignment. The Honor Code applies differently to each with respect to collaboration or assistance from anyone other than the instructor.

Individual Programs. Individual programs are expected to be the work of the individual student, designed and coded by him or her alone. Help locating errors is allowed, but a student may only receive help in correcting errors of syntax; help in correcting errors of logic is strictly forbidden. Guideline: Assistance from anyone other than the instructor in the design or coding of program logic will be considered a violation of the honor code.

Team Programs. Team programs are to be worked on in teams of two students. You are allowed to discuss team programs with your partners, but work with others is otherwise restricted as above. That is, others can help in correcting errors of syntax, but help in correcting errors of logic is forbidden. Guideline: Any work that is not the work of your team is considered a violation of the honor code.

If you do not understand how the honor code applies to a particular assignment, consult your instructor.

Students should be aware of the Computer Ethics outlined in the Student Handbook. Violations (including uninvited access to private information and malicious tampering or theft of computer equipment or software) are subject to disciplinary action.

Guideline: To protect your work, dispose of printouts carefully, and avoid leaving yourself logged in to public computers when you aren't in the lab.

The Honor Code as it applies to non-programming assignments is outlined in the Student Handbook. In addition, the following guidelines are in place for homework assignments and the final exam:

Homework. Collaboration on problems is permitted and even encouraged, as noted above. However, copying of solutions is not. The work you hand in should be your own. A good rule to follow is to work through the problems with others, taking only rough notes. You should then write your solutions independently, referring to your notes as little as possible. The idea is to understand each solution well enough that you can reconstruct it by yourself. In addition, you should write on each assignment the list of people with whom you collaborated.

Final Exam. You may refer to the books listed as Resources for this course, your own course notes, and your assignments while taking the final exam, and you may talk to me. No other sources of information are permitted.

The Department of Computer Science takes the Honor Code seriously. Violations are easy to identify and will be dealt with promptly.