Computer Science

Williams College

This Week | Coming Up | Past Talks | Contact Us

Weekly Colloquium

September 7, 2007

Organizational meeting, CoSSAC nominations, pictures, colloquia schedule, etc.

September 14 , 2007

"What I Did This Summer"
Computer Science students Wasin Vipismakul, Son Ho, Rhaad Rabbani, Greg Tobkin, Ben Wood, Jeff Marsceill, Jing CAo, Alexandra French discussed their summer research/work experiences.

September 21, 2007

Kenn Knowles
Ph.D. Candidate
University of California at Santa Cruz

" Hybrid Type Checking"

Traditional static type systems are effective for expressing and verifying basic interface specifications, but are limited in the specifications that they support. Contracts which are checked at run time, on the other hand, support any computable specification (expressed in the language at hand) but suffer from late and incomplete detection of violations. Hybrid Type Checking synthesizes these approaches by supporting types with the expressiveness of contracts, which are enforced statically when possible, but with some enforcement of specifications delayed until run time. Kenn explored the theoretical and practical aspects of Hybrid Type Checking and presented some experimental results as to its effectiveness in a prototype language, showing in the process how other features such as gradual typing and first-class types are handled by the same mechanism.

September 28, 2007

Gerome Miklau, Ph.D.
Computer Science Department
University of Massachusetts at Amherst

"Privacy and the Public Use of Social Network Data"

Social network analysis is concerned with uncovering patterns in graphs that describe entities and the social connections between them. It has been widely applied to study the influence or popularity of individuals in organizations, disease transmission in communities, the operation of computer networks, and the emergent behavior of physical and biological systems. While extremely valuable to analysts, social network data often describes relationships that are sensitive and sharing the data in full can result in unacceptable disclosures.

Professor Miklau began by motivating the study of social networks with some examples and reviewing some recent developments in anonymous data publishing. Then he described some of the threats to privacy posed by published social networks and our initial attempts to resist these threats. He focused on the threat of structural re-identification in which an individual's local relationships can be identifying, even when their names are protected.

(This talk is based on joint work with Michael Hay and David Jensen.)

Gerome Miklau is an Assistant Professor at the University of Massachusetts, Amherst. His primary research interest is secure data management: providing privacy, confidentiality, and integrity guarantees for data in relational databases and data exchanged on the World Wide Web. He received a NSF CAREER Award in 2007 and won the 2006 ACM SIGMOD Dissertation Award. He received his Ph.D. in Computer Science from the University of Washington in 2005. He earned Bachelor's degrees in Mathematics and in Rhetoric from the University of California, Berkeley, in 1995.

October 12 , 2007

Computer Science Faculty Research Talks

Faculty discussed their ongoing research in Computer Science.

Andrea Danyluk, Chair
Jeannie Albrecht
Duane Bailey
Steve Freund
Brent Heeringa
Morgan McGuie

October 18 , 2007

Co-sponsored by Computer Science and Art Departments

Dr. Donald H. House
Department of Visualization, Texas A&M University

"The Art and Science of Computer Graphics"

The roots of 3D Computer Graphics lie deep in the visual arts, psychology, computer science, physics and mathematics. Donald House explored how these mature disciplines animate this new kid on the block, and are impelling it to become an academic and professional discipline on its own. Work from the highly interdisciplinary graduate programs in Visualization at Texas A&M University will explore the theme. Illustrations will include experimental animated films, as well as research projects in real-time wave simulation and computer directed drawing from 3D models.

October 19, 2007

Sarah E. Calvo `96
Health Sciences and Technology (HST) at the
Broad Institute of MIT/Harvard, Cambridge, MA

"Disease Gene Discovery Through Integrative Genomics"

A parent walks into a health clinic with a very sick 1-yr old child, who presents with seizures, severe muscle weakness, and eyesight difficulties. The child's first cousin presented with related symptoms and died at 9 months, and a another cousin had died similarly at 2 years. How can we figure out what genetic defect is at fault?

In this talk Sarah discussed bioinformatics approaches to discovering genes underlying a class of rare diseases affecting the mitochondrion (the cell's energy producing organelle). Of the ~20,000 genes in the human genome, it's estimated that ~1500 relate to the mitochondrion -- and defects in any of these genes can likely cause rare diseases. Using 7 seven different genomic approaches, she predicts which genes are related to the mitochondria. She combined these 7 methods using a Naive Bayes supervised machine learning algorithm. Using this method, we have successfully discovered mutations underlying one rare mitochondrial disease and predict strong candidates for an additional 7 diseases.

She gave a brief background to the genomic approaches, and focused on several machine learning methodologies to tackle this problem, including Naive Bayes classifiers, support vector machines, and decision trees.


October 26, 2007

Wheeler Ruml, Ph.D.
Department of Computer Science
University of New Hampshire

"Solving Problems Under Time Pressure"

As every student knows, finding the optimal solution to a problem often takes a prohibitively long time. However, in many situations we can do better than simply choosing any feasible answer. Wheeler discussed this trade-off between solution quality and computation time in the context of shortest-path problems, a fundamental task in areas from bioinformatics to robotics.

November 9, 2007

David Irwin, Ph.D. Candidate
Duke University, Durham, NC

"Adaptive Virtual Machine Hosting for Networked Clusters"

An operating system (OS) is a software layer that manages the hardware of a single machine by coordinating its interaction with applications. In this talk, David discussed the design of a new type of OS that manages the hardware of multiple machines spread across the Internet for distributed applications. He based their design on well-studied principles from conventional OS research, and showed that applying the same principles to distributed resource management can address long-standing limitations in systems that manage collections of machines in clusters and data centers.

He described a prototype that manages a cluster by programmatically instantiating virtual machines and storage on-demand for end users and applications. He then discussed how the same system applies more broadly to the management of collections of other hardware devices, including the core routers, switches, and network links that embody the Internet, and the relationship between OS design and recent proposals calling for a ``clean slate" redesign of the existing Internet.

November 30, 2007

Özgür Simsek
Ph.D. Candidate
Department of Computer Science
University of Massachusetts - Amherst

"Acquiring Skill Hierarchies Using Reinforcement Learning"

The broad problem Özgür addressed in this talk is design of autonomous agents that are able to efficiently learn how to achieve desired behaviors in large, complex environments. She focused on one essential design component: the ability to form high-level actions, or skills, from available primitives. She described a characterization of a useful class of skills in terms of general properties of an agent's interaction with its environment---in contrast to specific properties of a particular environment---and introduced methods artificial agents can use to identify and acquire such skills autonomously.

December 7 , 2007

Glencora Borradaile, Ph.D.
Brown University

"Max Flows, Min Cuts and Planar Graphs, oh my!"

We are probably all familiar with the traditional max-flow and min-cut problems. In the former, you want to route as much information as you can between two nodes through a network. In the latter, you want to find a minimum set of edges that disconnects two nodes in the graph. The two problems are closely related and, as you will see in this talk, they were first studied during the Cold War.

Now suppose you want to find many min cuts in the same graph. Can you do this more efficiently than simply computing a min cut for every pair of nodes that you are interested in? It turns out that you can, and we will see a very elegant solution to this problem, dating to the 60s. We will also see how, if your input graph is a planar graph, you can find a particular set of min cuts very, very quickly.

February 7, 2008

Computer Science Class of 1960s Scholars Lecture and Colloquium
Jonathan Schaeffer, Ph.D.
University of Alberta, Edmonton, Alberta, Canada

"Computer (and Human) Perfection at Checkers"

In 1989 the Chinook project began with the goal of winning the human World Checkers Championship. There was an imposing obstacle to success -- the humanchampion, Marion Tinsley. Tinsley was as close to perfection at the game as was humanly possible. To be better than Tinsley meant that the computer had to be perfect. In effect, one had to solve checkers. Little did we know that our quest would take 18 years to complete. What started out as a research project quickly became a personal quest and an emotional roller coaster.

In this talk, the creator of Chinook told the story of man versus machine for supremacy at checkers.

February 8, 2008

Computer Science Class of 1960s Scholars Lecture and Colloquium
Jonathan Schaeffer, Ph.D.
University of Alberta, Edmonton, Alberta, Canada

"Raising the Stakes"

Poker is a challenging problem for artificial intelligence research: multiple opponents (up to 10), stochastic element (cards being dealt), imperfect information (don't know the opponent's cards), deception (bluffing), user modeling (identifying player patterns), and risk management (betting decisions). Unlike the classic AI game, chess, poker is more relevant to real-world situations including negotiations, military strategy, and e-commerce.

For over a decade, the University of Alberta Computer Poker Group has been working on building a high-performance poker program. This work has led them through multiple distinct phases of program design, each new idea "promising" to be the breakthrough to world-class play. Finally, they appear to be close. In a recent Man Versus Machine Match, University of Alberta programs narrowly lost to two world-class players. In this talk we motivate the research, compare the different program designs, and discuss what it will take to raise the stakes in man-machine poker competitions.

February 29

Benjamin Wood `08
Williams College

"Dynamic Synchronization Specification Inference: Why is my program safe?"

With the advent of multi-core processors, concurrency is increasingly important for performance in computing applications. However, concurrency is not without its problems. In programs simultaneously executing many threads of control, subtle synchronization errors such as race conditions may occur. Programmers must manually protect against these errors and mistakes are all too common. Debugging concurrent programs is exceedingly difficult due to the nondeterministic behavior of synchronization errors.

The effectiveness of current tools for debugging and checking concurrent programs is limited by trade-offs in precision, program coverage, and the need for programmer intervention. In this talk Ben evaluated these tools and presented a new approach which aims to increase precision without completely sacrificing program coverage.

March 7

Barath Raghaven
University of California, San Diego


"Overloading the Internet with Decongestion Control"

Loss avoidance has long been central to the Internet architecture. Ordinarily, dropped packets represent wasted resources. In this talk, Barath posit that the benefits of a network architecture that embraces---rather than avoids---widespread packet loss outweigh the potential loss in efficiency. He proposed an alternative approach to Internet congestion control called decongestion control. In a departure from traditional approaches, in this approach, end hosts strive to transmit packets faster than the network can deliver them, leveraging end-to-end erasure coding and in-network fairness enforcement. He argued that such an approach may decrease the complexity of routers while increasing stability, robustness, and tolerance for misbehavior. While a number of important design issues remain open, I’ll show that our approach not only avoids congestion collapse, but delivers near optimal steady-state goodput for a variety of traffic demands in three different backbone topologies.

March 14

Barb Cutler, Ph.D.
Computer Science Department
Rensselaer Polytechnic Institute

"Constrained Planar Remeshing for Architecture

Material limitations and fabrication costs generally run at odds with the creativity of architectural design, producing a wealth of challenging computational geometry problems. We have developed an algorithm for solving an important class of fabrication constraints: those associated with planar construction materials such as glass or plywood. Starting with a complex curved input shape, defined as a NURBS or subdivision surface, we use an iterative clustering method to remesh the surface into planar panels following a cost function that is adjusted by the designer. The algorithm was developed in conjunction with an architectural design seminar at MIT.

April 4

Ileana Streinu, Ph.D.

Professor of Computer Science
and Mathematics
Smith College
Northampton, MA

"Folding Carpenter's Rules, Robot Arms, Proteins: from geometry to combinatorics"

The Carpenter's Rule problem, first appearing in the topology community in the mid '70s and then in Computer Science in the '90's as a robot arm motion planning problem, asks whether every simple planar polygon with fixed edge lengths can be reconfigured continuously between two positions, without producing any self-intersections along the way. The solution is a mixture of ideas from geometry, rigidity theory and polyhedral combinatorics, all leading to a curious (but nice and friendly) object, called a pseudo-triangulation.

A main attraction of this talk is its graphical appeal: every concept I define is elementary, depicted graphically (with lots of two- and three-dimensional props) and easy to understand. The "protein" part of the title leads to the future, to one of the major problems in science today (protein folding): I will conclude telling you what is the connection between folding a robot arm and a protein, and where is this research leading to.

April 11

Oliver Brock, Ph.D.
Assistant Professor, Department of Computer Science
Robotics and Biology Laboratory
Computational Biology Laboratory
University of Massachusetts Amherst

"Robots and Proteins: Two Unlikely Journeys Through High-Dimensional Spaces"

Intelligent, dexterous, and autonomous robots will profoundly alter our everyday lives. But creating these robots has proven to be a much bigger challenge than researchers believed in the early days of AI. In his talk, Professor Brock presented his efforts towards an algorithmic foundation for dexterous and autonomous robots. The behavior of such robots can be captured in high- dimensional spaces. In these spaces, even simple behaviors require computations with provably bad complexity bounds. The fundamental theme of this research is thus the quest for structure in these spaces to alleviate and possibly overcome some of the difficulties encountered in creating robotic autonomy and dexterity.

In the second part of his talk, he discussed a somewhat surprising connection between robots and proteins. Proteins are a fundamentalbuilding block of life; they perform a variety of important functions in the cells of all living beings. A complete understanding of proteins and their functions would revolutionize biology and medicine. But the behavior of proteins, just like that of robots, is the result of many physicochemical phenomena and can only be characterized in high-dimensional spaces. Making progress towards understanding protein function, he argued, depends on the identification of appropriate structure within these high-dimensional spaces.

April 18, 2008

"Inference and Learning with (Social) Network Data"

Foster Provost, Ph.D.
Stern School of Business
New York University

In many applications we would like to draw inferences about entities that are interconnected in complex networks. For example, calls, emails, IM, and web pointers link people into huge social networks. However, traditional statistical and machine learning classification methods assume that entities are independent of each other. In this talk Professor Provost started by discussing various applications of "classification" (scoring) in networked data, from fraud detection to network-based marketing. He then discussed four characteristics of networked data that allow improvements--sometimes substantial--over traditional classification: (i) models can take into account "guilt by association," (ii) inference can be performed "collectively," whereby inferences on linked entities mutually reinforce each other, (iii) characteristics of linked entities can be incorporated in models, and (iv) models can incorporate specific identifiers, such as the identities of particular individuals, to improve inference. He presented results demonstrating the effectiveness of these techniques.

Bio:

Foster Provost is Associate Professor, NEC Faculty Fellow, and Paduano Fellow of Business Ethics at New York University's Stern School of Business. He is Editor-in-Chief of the journal Machine Learning, a founding board member of the International Machine Learning Society, and was program chair of the ACM SIGKDD Conference in 2001. He has received Faculty Awards from IBM and a President's Award from NYNEX Science and Technology. His recent research has focused on inference and learning with network data and utility-based data mining. He has been involved in various applications of machine learning technology to real-world problems, including fraud detection, network diagnosis, customer contact management, targeted marketing, and counterterrorism.

 

April 25, 2008

Brian Levine
Associate Professor
University of Massachusetts Amherst

"Enhancing the Capacity of a Vehicular Network with Throwboxes"

Delay-Tolerant Networking (DTNs) protocols enable routing when mobile nodes are sparsely populated and connect only intermittently. Such scenarios are seen by vehicular networks supported by wireless meshes, sensor deployments for ecological and urban monitoring, networks that attempt to survive large-scale natural disasters, and underwater sensor networks. We have deployed a 40-node DTN operating on a public transit system, called DieselNet, that covers a 150 square mile area sparsely. In this talk, Professor Levine presented a characterization of the network's performance, provided a summary of some projects the testbed has supported, and he detailed one contribution in particular: adding capacity to the DTN with throwboxes.

Throwboxes are stationary, solar-powered nodes with storage and processing, that enhance the capacity of the network by acting as relays for routed data. The use of throwboxes without efficient power management is minimally effective. Our throwbox uses an approximate heuristic for solving the hard problem of meeting an average power constraint while maximizing the number of bytes transferred by the throwbox to passing buses. Our deployment of throwboxes in DieselNet show they are very effective and inexpensive methods of enhancing the capacity of the network.

Biography

Brian Neil Levine is an Associate Professor in the Department of Computer Science at the University of Massachusetts Amherst, which he joined in 1999. He received PhD and Master's degrees from the University of California, Santa Cruz in Computer Engineering. In 1994, he received a B.S. in Computer Science & Applied Math from the University at Albany. In the networking area, his research focuses on mobile systems and peer-to-peer networking. In the security area, his research is focused on privacy and forensics. He was awarded an NSF CAREER grant in 2001 for work in peer-to-peer networking. In 2004, he was awarded a Lilly Teaching Fellowship from UMass Amherst, and in 2007 he was awarded his college's Outstanding Teacher Award. Levine is an associate editor of the IEEE/ACM Transactions on Networking, and he was co-chair of ACM NOSSDAV in 2006.

May 1, 2008

Ben Fry, Ph.D.
2006 - 2007 School of Design Nierenberg Chair
Carnegie Mellon University

"Processing"

What began as a domain-specific extension to Java targeted towards artists and designers has turned into a full-blown design and prototyping tool used for large scale installation work, motion graphics, and complex data visualization. Ben Fry, co-developer of the “Processing” project discussed work being developed by artists, designers, programmers and scientists who make use of Processing. Fry also discussed his own information design and visualization projects that range from genetics to politics to understanding how software works. Ben received his doctoral degree from the Aesthetics + Computation Group at the MIT Media Laboratory, where his research focused on combining fields such as Computer Science, Statistics, Graphic Design, and Data Visualization as a means for understanding complex data.