Visual Media Revolution
CS372T (Q) · Fall 2015
Prof. Morgan McGuire
Williams College · Department of Computer Science
Assignments (negotiable):
- Voxels
- Implicit Surfaces
- Camera Art
- Camera Engineering
- Head-Mounted Displays
- Virtual Reality Avatars
- Design for 3D Printing
- 3D Printing Dynamic Objects
- Stylized Rendering
- Augmented Reality
- Post-Modern Video Games
Alternative assignments requested by students:
- Binary Shading
- Image Self-Similarity
- Optimal Animation
- Synthetic Creatures
- Unconventional Animation Algorithms
- Psycho
We live at the beginning of the second revolution in visual
media. Two centuries ago, the camera and the Jacquard loom
introduced machines for creating art. By automating the artist's
hand, they also forced questions of how objective technique
gives rise to subjective meaning and where the border lies
between mechanical and human contributions. Those progenitors
eventually led to digital film, computer games, and digital
content creation for architecture and industrial design.
Today, accessible and pervasive computation provokes a second
revolution. Augmented reality, 3D scanning, 3D printing, virtual
reality, and computational photography are exploding into
mainstream experience. Where previous digital media refined
analog practice through evolution, these are revolutionary forms
that could not exist without computation.
As the world seeks the promise of
new visual forms, we find that the fundamentals of earlier media
remain valid, and we take them as our guide. This tutorial
investigates the technology of emerging computational media and
explores their impact on the relationship between process and
aesthetics.
Administrative Catalog Information:
This course is a tutorial. It may not be taken on
a pass/fail basis or as a fifth course. Requires CS256:
Algorithms. The course is capped at 10 and preference will be given
to current or expected Computer Science majors per department
policy.
Structure: You and your tutorial partner will meet with
me once a week. One student will present the assigned reading
and the three of us will discuss and debate it. The presentation
role will alternate weekly between students, and both students
will be graded on performance each week. In some cases, a
short essay will also be required from each student.
One goal of the course is to practice presenting technical
material carefully. I will evaluate your presentation style and
clarity as well as content. Grading will be based on the
geometric (not arithmetic) mean of those elements, which
encourages you to develop your weakest areas rather than
doubling down on your strengths.
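The difference matters more than it may appear: under a geometric mean, one weak component drags the overall grade down more than a strong component can pull it up. A quick illustrative sketch (the scores are made up):

```python
from math import prod

def arithmetic_mean(scores):
    return sum(scores) / len(scores)

def geometric_mean(scores):
    # nth root of the product of n scores
    return prod(scores) ** (1 / len(scores))

balanced = [3.5, 3.5, 3.5]   # consistent performance
lopsided = [4.0, 4.0, 2.5]   # one weak area

# Both lists have the same arithmetic mean (3.5), but the lopsided
# student's geometric mean is lower (about 3.42).
```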
There are no mandatory programming exercises and no exams. I
anticipate that the average final grade in the course will be
3.5. As this is the first offering, grades will be curved
after ranking students to ensure a fair scaling.
Sessions: All meetings in TCL308 by default.
- M10: Jamie & Tony
- M11: Devin & Kyle
- W10: Nigel & Kelly
- W11: Lauren & Eli
- F10: Kai & Alex
There is a special organizational meeting on
Thursday, September 10 at 8:30pm in TCL104B to meet as an entire class,
review this syllabus and answer questions about it, and arrange times
for future full-group meetings.
Assignments: For each week there is typically one key
reading for presentation and discussion. Supplemental readings
preceding it help to provide background if you haven't
encountered the topic before. Supplemental ones following it are
related works that expand the topic or provide an alternate
view. You don't need to read all of the supplemental material.
I predict that the top students in the course will regularly
reference ideas from followup supplemental material
during discussion and that everyone will at least skim the background
material for many topics.
We'll use The
Graphics Codex as a common textbook for supplemental
readings that concisely survey well-understood areas of graphics. I recommend
Computer Graphics: Principles and
Practice as a comprehensive overview of the mathematics
of graphics. CG:PP, the other supplemental books, and the
films are on reserve in the library (plan ahead or buy your own
copy: there will be a lot of demand). I recommend screening the
films as a group because you'll hear each other's reactions
during the screening and naturally discuss them afterwards. The
games referenced are available in the Unix lab.
Expect to spend up to ten hours per week on the reading and to
feel a bit overwhelmed or lost about half of the time. You may--and
should--collaborate with your tutorial partner and others in the
course without restriction outside of session, so long as you
acknowledge their contributions in any written documents.
I list some sample questions for each topic. Although you may,
you are not required to explicitly answer these questions or
raise them in discussion. Instead, you should use these as
guides to focus your initial reading and help determine whether
you understand the articles and their impact. When you struggle
with the questions, go back to more of the supplemental reading
to help provide context. I expect you to go far beyond my
initial questions in your full exploration and discussion.
Beware that, particularly for a scientific publication, the
"pitch" of a paper was tuned for the reviewers at the time of
submission, and it often does not stand the test of time the
way the actual content does. In many cases, the significance of
a paper is not its primary claim, at least not as it
was presented.
Because tutorial groups will move at slightly different speeds
and may follow up with additional readings to dive deeper into
some topics, these assignments aren't tied to specific weeks.
Proceed through these in order. It is the responsibility of the
presenting student to ensure that the partner and professor
agree a week beforehand on the topic.
Reading: Do not read a scholarly paper or book like a
novel. When studying films and games, you also need to
have both the experience of a "normal" audience member and
to step back and analyze that experience.
You must be an active participant to approach
the reading material in this course. For many people
(including me), it is generally hard and a bit unpleasant,
although there is joy in appreciating it more deeply and
discovering insights.
When working with readings other than peer-reviewed scientific
publications in this course, I intend that you approach them
through the lens of science. Analyze and critique them in light
of the related scientific work from that week and other readings
in the course. Propose methods for quantifying or automating, or
enumerate the technical challenges in delivering the vision that
they put forth. This is the core of how computational graphics
research and development advance: by understanding the goals of
authors of visual media and then specifying and creating
technology to support them.
It is easy to criticize older works in light of newer ones and
modern cultural mores. Such criticism is largely a waste of time,
so avoid the temptation to take cheap shots. If I assign an
older work, it is so that we can learn from its methodology,
as well as appreciate the evolution of visual media to predict
its future. Ask how older works innovated relative to
their historical contexts, and what they teach us about how to
innovate in our own historical context. This is the most
important lesson to take away from the class: how great
scientists and artists work, not the specific output that they
produced as a result of their time and place.
Scholarship builds on the shoulders of giants. In any scholarly
work, probably 90% of the impressive work is actually something
that was well known in the field beforehand, 9% is an
application of an existing idea in a new way, and maybe 1% is
true innovation or new insight. This is often not clear from
the text, and you're expected to be familiar enough with the
area to understand what is really new.
For example, reading Kajiya's "The Rendering Equation" paper,
you might reasonably think that he invented the rendering
equation (he didn't) or recursive ray tracing (no)--actually,
he's contributing analysis and a crude form of importance
sampling. In Jensen's "Global Illumination with Photon Mapping,"
you might think that the photon tracing and radiance estimation
are his contributions. In fact, his contribution is using
a k-d tree for the photon map instead of a flat array,
and some particular low-precision representations of
photons. Those are both really important graphics papers, but
not for the reasons that they might appear to be. It is easy to
watch Star Wars: A New Hope and find it innovative for
effects and style...but much of that is coming from the
less-known 1972 Silent Running (which was also laying the
groundwork for WALL-E, and was following up on
Trumbull's FX work on the Kubrick film 2001). The 1982 Blade Runner is
revisiting many of the themes and visuals from Lang's
1927 Metropolis and 1931 M (all of which are
cinematic masterworks!)
For computer science and mathematics papers, I often have to
stop and write a little program to test the ideas from the text
or fully grasp them. I expect that you may too. Don't look at
equations or code listings and just say, "yes, that looks about
right." Ensure that you could reproduce them. Run through the
execution of algorithms by hand as if you were a computer.
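As an example of the kind of little program I mean, here is a quick check that a Monte Carlo estimator really converges to the integral it claims to compute, which is the core idea behind Kajiya's paper discussed above. The integrand and sample count are arbitrary choices for illustration.

```python
import random

def monte_carlo_integrate(f, a, b, n):
    """Estimate the integral of f over [a, b] by averaging n uniform samples."""
    total = sum(f(random.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

random.seed(0)
# The true integral of x^2 over [0, 1] is 1/3; with 100k samples the
# estimate lands close to it, and more samples tighten the error.
estimate = monte_carlo_integrate(lambda x: x * x, 0.0, 1.0, 100_000)
```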
I usually read the title and abstract, and
then jump to the results section. I try to understand how well
the technique works from the figures. Then, I double back to the
related work section to see how this paper relates to work I'm
familiar with. I may then go to the algorithms section or
introduction. Once I have a basic idea of what is going on in
the paper, I return to the beginning and make a full read of
the whole thing.
Expect to read some text, pause, ask yourself a question about
it, and try to figure out the next step on your own. When
you're reading an equation or looking at a diagram, ask yourself
what each symbol means. If you're not quite sure what is going
on, don't keep reading--backtrack and read that section
again. Maybe out loud. Try reproducing derivations. Be prepared
to skip around, refer to supplemental material or cited previous
work, and re-read critical passages many, many times.
For film and media criticism, I often have to step back and
summarize the author's points and then try to apply them in a
new context for evaluation. All professional writers are facile,
and it can be easy to let the words roll over one unquestioned.
What are the key terms? What are the
supports for the argument or claims for the analysis? What
are examples outside the text that support it? Can I think of
contradictory examples?
For visual works themselves, I find a good approach to
analysis is to focus on technique. What other works is this one
referencing? Can you find the same
shot/mechanic/line/costume/etc. in another work? This scene
makes you feel sad--why? Is it sad with the sound off? Would it
still be sad if viewed from another angle? How frequent are the
cuts? Where are the cameras and lights? Are all of the shots
from the same take or spliced from different performances? Make
a color script for the work. Make notes on the musical score and
effects. Whose point of view is represented? How reliable or
unbiased is that viewpoint? Are the actions depicted literal?
In a game, why did you make the choices you did? How did you
know the rules? What interactions are making you empathize with
specific characters?
I'd like to think the following is obvious, but since this may
be your first experience studying in multiple media, these
points may be obvious only in hindsight. You must first engage
each work as it was intended to be experienced. You can't
experience a film by reading about it--you must watch it. You
can't experience a game by watching someone else play it--you
must play it yourself. You can't experience a VR session through
a desktop port. You can't experience a sculpture by looking at a
picture--you must go see it in person. Etc. If a film was
intended for a theatre screen, watch it on the largest screen
that you can find (e.g.: Paresky auditorium or any classroom),
in a dark environment. Don't force it into a YouTube window
while you're chatting on Facebook in a crowded common room.
Presenting: Each week, the presenting student must e-mail
an agenda to his or her tutorial partner and me at least 48
hours before our session. This agenda may be in the form of an
outline (in ASCII text, HTML, or PDF) or PPT slides (only
PowerPoint is allowed). It must include the major points of the
presentation, presentation aids or their descriptions, and some
key questions or issues for discussion. However, if you're
working closely with your partner to create the presentation,
then you can send me the slides at the last minute. I'll give you
more useful feedback if you send it earlier.
The presenting student will walk us through the reading in about
30 minutes. Available tools include PowerPoint on my laptop, an
HDMI 1080p TV, printouts, and a blackboard with colored chalk.
Plan for 15 minutes of material, which will be slowed to about
30 minutes by questions. If working with your partner to create
the presentation, don't over-polish the content. Figure out a
lot of it for yourselves, but also leave openings for us to
explore it further in session instead of working everything
out. I want to be an active part of the exploration process
rather than seeing a perfect and final presentation.
Specifically, you have three good choices when you encounter
terms and ideas that you don't understand in the work:
- Don't mention these topics (and if I ask,
you can refocus the discussion on what you did want to talk about)
- Figure out what this topic is and explicitly present it, or at least
be prepared to present it if asked
- Explicitly mention what you didn't understand (possibly because you
didn't have time to trace down this tangent) and then open it for discussion
I prefer the last option because I want the tutorial to be about
us figuring out this material together in small groups. Ideally,
half of your presentation will be devoted to things you
struggled with and want to dig deeper on. However, topics vary
in how much of our limited discussion time they are worth.
What you should not do is mention a topic in passing and
hope that your partner and I won't notice. If you've raised an
issue in a presentation, you must be prepared to resolve
questions about it in some way.
If using slides, I highly recommend cutting and pasting figures,
still shots, and equations from the source work. Minimize
on-screen text (although you're free to insert text in Presenter
Notes) and instead present those ideas orally.
Throughout the presentation, the non-presenting partner should
add asides with new information, raise questions about what he
or she didn't understand (or the pair didn't understand), and
help respond to my questions. Near the beginning of the
semester, partners often choose to work as a pair and have equal
understanding of the work. But as your other obligations grow during
the semester, it is reasonable for the presenting partner to take the
lead and use the session to help the other partner understand the work.
The conversation may naturally transition to a free-form
discussion of the work and issues that it raises. If it does
not, the non-presenting partner should explicitly shift us
into a more free-form discussion after about 30 minutes.
Voxels and Filtering
Supplemental Background:
- GigaVoxels: Ray-Guided Streaming for Efficient and Detailed Voxel Rendering, Cyril Crassin, Fabrice Neyret, Sylvain Lefebvre, and Elmar Eisemann, I3D, February 2009
- Materials [_rn_matrls] chapter and Microfacet Distribution [D] topic in The Graphics Codex, McGuire, 2015
- Physically-Based Shading at Disney, Brent Burley, in Physically Based Shading, SIGGRAPH Courses, August, 2012
- Representing Appearance and Pre-filtering Subpixel Data in
Sparse Voxel Octrees, Eric Heitz and Fabrice Neyret, HPG, 2012
Reading:
- The SGGX Microflake Distribution, Eric Heitz, Jonathan Dupuy, Cyril Crassin, and Carsten Dachsbacher, SIGGRAPH, August 2015
Followup Work:
- Interactive Indirect Illumination Using Voxel Cone Tracing, Cyril Crassin, Fabrice Neyret, Miguel Sainz, Simon Green, and Elmar Eisemann, Pacific Graphics 2011
- A Survey of Non-linear Pre-filtering Methods
for Efficient and Accurate Surface Shading, Eric Bruneton and Fabrice Neyret, TVCG, 2011
Questions and Exercises:
- What is an AABB?
- What is a voxel?
- What is aliasing, in a signal-processing sense?
- What is an "anisotropic" material? Give some real-world examples.
- Write pseudocode for traversing an (axis-aligned, origin-centered) octree to find the node containing a 3D point.
- What is the relationship between the shading normal, the geometric normal, and the microfacet "distribution of normals"?
- Why do we want to "prefilter" data before sampling it?
- What is the difference between a "microfacet" and a "microflake"?
- Make a table expanding the abbreviations from the paper, including NCF, PDF, BRDF, LOD, LEAN, LEADR, VNDF, GPU, and CDF (tip: GGX doesn't really stand for anything).
- What is the SGGX representation? Make a pseudocode class definition for a model using it.
- What are the parameters stored at each leaf? (break this down into the individual scalars)
- What is the algorithm for filtering the parameters from child nodes to parents?
- What is the algorithm for interpolating the parameters between adjacent child nodes?
- How do you compute a SGGX representation from polygons?
- About how long would you expect it to take to trace one primary ray per pixel at 1080p using the SGGX representation? (milliseconds? seconds? minutes? hours?)
- What are the limitations of SGGX? What are the new research problems that should be tackled?
- The paper focuses on primary visibility and direct illumination. What kinds of bias and aliasing does SGGX not address in the context of a full 3D renderer? What assumptions does it make about the underlying geometry?
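Since one of the warm-up exercises above asks for octree traversal pseudocode, here is a minimal runnable sketch of point location in an axis-aligned, origin-centered octree. The class layout and field names are my own invention, not from the paper.

```python
class OctreeNode:
    def __init__(self, center, half_size, children=None):
        self.center = center          # (x, y, z) center of this cell
        self.half_size = half_size    # half of the cell's edge length
        self.children = children      # list of 8 child nodes, or None for a leaf

def find_leaf(node, p):
    """Descend to the leaf whose cell contains point p (assumed inside the root)."""
    while node.children is not None:
        # The child index packs one bit per axis: is p above the center plane?
        i = (int(p[0] >= node.center[0])
             | (int(p[1] >= node.center[1]) << 1)
             | (int(p[2] >= node.center[2]) << 2))
        node = node.children[i]
    return node
```

The same descent loop generalizes to ray traversal, which must visit the (up to) four children a ray pierces in front-to-back order rather than a single child.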
Implicit Surfaces and Point-Based Rendering
Supplemental Background:
- Surfels: Surface Elements as Rendering Primitives, Hanspeter Pfister, Matthias Zwicker, Jeroen van Baar, and Markus Gross, SIGGRAPH, August 2000
- Ray Marching chapter [_rn_rayMrch] in The Graphics Codex, McGuire, 2015
- Antialiasing: Are we there yet? (section on temporal antialiasing only), Marco Salvi, Talk in the Open Problems in Real-Time Rendering course at SIGGRAPH 2015, August 2015
Reading:
- Learning from Failure: a Survey of Promising, Unconventional and Mostly Abandoned Renderers for `Dreams PS4', a Geometrically Dense, Painterly UGC Game, Alex Evans, in Advances in Real-Time Rendering, SIGGRAPH Courses, August 10, 2015
Followup:
- Dreams Trailer, Media Molecule 2015
- Dreams Surreal Sandbox Talk, Alex Evans, E3, 2015
- Enhanced sphere tracing, Benjamin Keinert, Henry Schäfer, Johann Korndörfer, Urs Ganse, and Marc Stamminger, Smart Tools & Apps for Graphics, 2014
- Two Uses of Voxels in LittleBigPlanet2's Graphics Engine, Alex Evans and Anton Kirczenow, in Advances in Real Time Rendering, SIGGRAPH Courses, August 16, 2011
- Cascaded Light Propagation Volumes for Real-Time Indirect Illumination, Kaplanyan and Dachsbacher, I3D 2010 / Light Propagation Volumes in CryEngine 3, Kaplanyan, SIGGRAPH 2009 talk
Questions and Exercises:
- You'll need to understand the whole talk, but when you present it, I recommend focusing on the final renderer
- Who is Alex Evans? What major projects has he worked on? What is his background and bias in computer graphics?
- What do players do in this "game"? How do you play?
- What is an "implicit surface"?
- What is a superellipsoid?
- What is ZBrush? 3DS Max?
- Make a table expanding the following acronyms and what they mean: CSG, SDF, VB, IB (hint: vertex buffer, index buffer), CS (hint: compute shader), UI, FFD, and GCN.
- What are the Marching Cubes algorithm's input and output? How does it work? (This is a great algorithm that uses a lot of clever ideas; everyone should be aware of it)
- Quantify and compare the computational resources of the GPUs in PS4, iPhone 6, the highest-end MacBook Pro, and GeForce Titan X. What are the ball-park relative performances? (hint: I used Passmark's charts for a coarse comparison)
- Why does Evans abandon polygon modeling and rasterization?
- Give example pseudocode for a simple composite implicit surface, such as a snowman constructed from spheres and cylinders.
- List the renderers that Evans prototypes, with the key reason each is abandoned.
- Why doesn't Evans directly ray-trace the implicit surfaces?
- Why is it fortunate that no distance-smear or other nonlocal artist tools were added?
- Describe the key data structure in Evans' final renderer, and the orders of growth for computing and rendering from it.
- How does Evans animate models?
- Write pseudocode for the final renderer, assuming the model data structure has already been constructed.
- How is the fur modeled on the polar bear? What makes it look fuzzy?
- How does he animate the objects for the trailer(s)?
- Are they articulated rigid bodies, matrix skinned, or individual SDFs moving around?
- Is the scene re-voxelized on the fly during animation?
- How is the skeleton created for the animation?
- How is motion data entered by the player?
- How does he choose the orientation for the textured splats? There's one degree of freedom left since implicit surfaces only define a normal, not the orientation of the tangent basis in the tangent plane. (hint: Bridson)
- How does Evans discover the points for the final renderer? Are there really any voxels left at that point, or are they implicit?
- What are Hilbert curves, and why are they frequently used in computer graphics data structures?
- Why does Evans introduce the point clustering? (Hint: he's performing LOD per-cluster, but that isn't the primary reason to have clusters)
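One exercise above asks for pseudocode for a composite implicit surface. A minimal runnable sketch in the signed-distance-function style that Evans works with (the sphere placement and radii are my own invention) could look like:

```python
import math

def sdf_sphere(p, center, radius):
    """Signed distance from point p to a sphere: negative inside, positive outside."""
    return math.dist(p, center) - radius

def sdf_snowman(p):
    """A composite implicit surface: the union (min) of two stacked spheres."""
    body = sdf_sphere(p, (0.0, 0.0, 0.0), 1.0)
    head = sdf_sphere(p, (0.0, 1.3, 0.0), 0.6)
    return min(body, head)

# The surface is the zero set: the points p where sdf_snowman(p) == 0.
```

Richer compositions replace min with a smooth-minimum blend so the shapes merge without a crease, and subtraction (max with a negated distance) carves material away.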
Camera Art
Supplemental Background:
- Rear Window, dir: Alfred Hitchcock, film, 1954
- Composition + Framing - Storytelling with Cinematography, Simon Cade, Video, 9 min, February 7, 2015
- Camera Movement - Storytelling with Cinematography, Simon Cade, Video, 5 min, June 20, 2015
- Basic Cinematography, McGuire, 2014
Reading:
- The introduction and chapters 4, 5, 8, 12 in Hitchcock, Truffaut, Simon & Schuster, 1985 (originally 1967)
- North by Northwest, dir: Alfred Hitchcock, film, 1959
Followup Work:
- Sequencing the North by NorthWest Crop Dusting Scene, Barry Ritholtz, May 12th, 2011
- Chapters 1, 11, and 15 of Hitchcock, Truffaut, 1985
- Psycho, dir: Alfred Hitchcock, film, 1960
- High Anxiety, dir: Mel Brooks, film, 1977
- Enjoy It, Julia, While It Lasts, Michael J. Lewis, New York Times, May 19, 2005
- Grammar of the Film Language, Daniel Arijon, 624 pages, 1976 (1991 reprint)
- My Life in Film, Ingmar Bergman, 2011 (originally 1990)
- Films of My Life, Truffaut, 1994 (originally 1975)
Questions and Exercises:
- List specific scenes from subsequent thriller films that reference shots, sets, or plot elements from North by Northwest. James Bond films are a good place to look.
- Enumerate the principles Hitchcock uses for camera framing. What are his unwritten rules?
- When and how does the camera move in North by Northwest?
- For a few major scenes, write down a shot list. For each, give:
- Visual and audio transition from previous shot (e.g., cut, wipe, dissolve)
- Notes on transition from previous shot--what parameters were changed, and which were preserved? How did the camera move between the shots? When did the audio transition relative to the video?
- Approximate vertical field of view
- Approximate distance from the subject
- Approximate depth of field
- Height of the camera off the ground
- Roll and pitch of the camera
- Placement of the focus of attention in 2D within the frame
- Relative heights and sizes of characters as perceived in 2D
- How does the camera move within the shot?
- Shot duration
- Blocking of the main elements (e.g., characters) as both 2D in frame and top-view 3D, showing where the camera, lights, and elements are
- Does the exact frame from the movie poster actually appear in the film?
- Make a wall-clock timeline of the pacing of events in the crop-duster scene, as perceived by the audience
- Whose viewpoint does the film primarily follow? Which character do you identify with? What specific techniques created that empathy?
- Is the camera a reliable observer, or are we being shown events as perceived by specific characters?
- What does the audience observe or know that the character doesn't? What does the character know that the audience doesn't? (Obviously, this changes over time)
- What are the "special effects"? When are we seeing minatures, stunt doubles, forced perspective, or other tricks that create an artificial reality different from what really occured on set?
- When are you aware of the camera and lights, and when does the film fade into your own reality?
Camera Engineering
Supplemental Background:
- Camera Specifications and Transformations (chapter 13) and Splines and Subdivision Curves (chapter 22), Computer Graphics: Principles and Practice, 3rd Edition
- The virtual cinematographer: a paradigm for automatic real-time camera control and directing, Li-wei He, Michael F. Cohen, and David H. Salesin, SIGGRAPH, 1996
- Real-Time Cameras, Mark Haigh-Hutchinson, CRC Press, 544 pages, April 14, 2009
Reading:
- Camera Control in Computer Graphics, Marc Christie, Patrick Olivier, and Jean-Marie Normand, Computer Graphics Forum, 27(8):2197-2218, 2008
- The Last of Us (the PS4 Remastered edition, if you can get it), Naughty Dog, 2014
Followup Work:
- ShowMotion: Camera Motion based 3D Design Review, Nicolas Burtnyk, Azam Khan, George Fitzmaurice and Gord Kurtenbach, Proceedings of The ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, 2006
- Uncharted 2: Among Thieves, PS3 video game, Naughty Dog, 2009 (in Sawyer)
- Scroll Back: The Theory and Practice of Cameras in Side-Scrollers, Itay Keren, Gamasutra, May 2015
Questions and Exercises:
- Do Christie et al. propose any new algorithms, or only survey existing ones?
- What is INRIA?
- What is "cinematography"?
- What is the "ray casting" algorithm, and how would it be used for camera control?
- What is a "semantic representation"?
- Write pseudocode for a Hitchcock-style camera control algorithm to be executed by a human or extremely sophisticated AI. What are the major differences between this and the methods described by Christie et al.?
- Shadow of the Colossus and Super Mario 64 were widely regarded in the games industry as having the best cameras of games of the era of the ones discussed in the paper. Why? Are their techniques represented in the paper?
- The paper is from 2008; many games have been released
since then! How does each of the following games advance
the state of the art for cameras in virtual worlds?
- Republique
- Heavy Rain
- Uncharted 3
- God of War 3
- Bayonetta 2
- The Order: 1886
- The Last of Us
- Real-time 3D programs emulate some real camera
effects, such as lens flare, depth of field, chromatic
aberration, vignetting, and motion blur...but they
currently do so imperfectly. How does this affect
cinematography in real-time 3D applications?
- Rephrase this sentence in your own words:
"In practice, even partially automated three-dimensional
multimedia generation requires an interpretation and synthesis
framework by which both the visuospatial properties of a
viewpoint can be computed (i.e. the interpretive framework)
and the viewpoint controlled according to the constraints arising
from the semantics of the language used (i.e. the synthesis
framework)."
- Why do the authors claim that camera control is PSPACE-hard?
Do you agree, or think it is harder or easier?
- What does it mean for a camera model to be "Euler-based"? Why is that name used?
- What is "gimbal lock"?
- Why is real-time camera control perhaps closer to documentary cinematography than
fiction-film cinematography?
- Name and define some methods for shooting dialogue.
- What are the major camera control algorithm classes?
- What are the major direct control methods for cameras?
- What are all of the variables in equation 3? What does it signify?
- Almost all of the methods described abstract geometry to simple boxes, points, and spheres. Is this reasonable? Is it a major source of error?
- Are graphics researchers working on the right problems in cinematography? Are there open problems, or do current approaches cover the space pretty well?
- Direct character control is the major motivation for avoiding cuts and certain kinds of camera movements in games. However, recent games are increasingly moving to higher level control of characters--this lends itself to more interesting interactions as well. What are current or hypothetical interaction paradigms that admit more interesting cinematography?
- How does the cinematography in the latest video games compare to classic Hollywood cinema? To today's cinema?
- How would you incorporate principles from film cinematography into real-time applications?
- How does virtual reality change the goal of real-time cinematography?
- The paper poses an interesting idea in passing: game cameras are like documentary ones, where they report the action of the game somewhat objectively and with minimal manipulation of the scene. This gives some insight into the constraints on game cameras and where we might look for specific techniques. But it also prompts the question of how objective non-fiction film really is. What are the major differences and "rules" for editorial control in:
- Cinema Verite
- Direct Cinema
- Visual Journalism
- Documentary
What does it mean for a visual work to accurately depict its subject? Does that preclude editorial control and manipulation? Can that in fact be achieved without editorial control and manipulation?
- As a followup to the previous question, perform some cursory research (Wikipedia and IMDB are fine in this case) on the following landmark works:
- Nanook of the North (1922)
- Man with a Movie Camera (1929)
- An American Family (1973)
- The Thin Blue Line (1988)
- Bowling for Columbine (2002)
- The Bridge (2006)
- The Act of Killing (2012)
Head-Mounted Displays
Supplemental Background:
- Asynchronous Timewarp Examined, Michael Antonov, March 3, 2015 (blog post)
- Adaptive frameless rendering, Abhinav Dayal et al., EGSR 2005
- What VR could, should, and almost certainly will be within two years, Michael Abrash, Steam Developer Days talk 2014
- Keynote, John Carmack, Oculus Connect 2014 (video)
- Snow Crash, Neal Stephenson, 480 pages, 1992
- A billion in computer graphics, McGuire, February 2015 (blog post)
- No, it's not a "Retina Display", John Hable, blog post May 27, 2011
Reading:
- Abrash's survey of VR research:
- Why virtual reality isn't real to your brain [pt 1] (start reading at "How images get displayed"), Michael Abrash, May 15, 2013 (blog post)
- Why virtual reality isn't real to your brain: judder, Michael Abrash, June 20, 2013, (blog post)
- Down the VR rabbit hole: Fixing judder, Michael Abrash, July 26, 2013 (blog post)
- Foveated 3D Graphics, Guenter et al., SIGGRAPH Asia 2012
- Epic Games and VR, Tim Sweeney, September 24, 2015 (video)
Followup:
- Mobile VR, John Carmack, GDC talk, 2015 (video)
- The Most Important Movie of 2015 Is a VR Cartoon About a Hedgehog, Angela Watercutter, Wired, July 2015
- Oculus Rift Manufacturing Overview, RoadToVR.com, October 2015
- Unraveling the enigma of Nintendo's virtual boy, 20 years later, Benj Edwards, FastCompany, 2015
- Cascaded Displays: Spatiotemporal Super-Resolution Using Offset Pixel Layers, Heide et al., SIGGRAPH, 2014
- Perception of Highlight Disparity at a Distance in Consumer Head-Mounted Displays, Robert Toth, Jon Hasselgren, and Tomas Akenine-Moller, HPG, 2015
- A Stereo Display Prototype with Multiple Focal Distances, Akeley et al., SIGGRAPH 2004
Questions and Exercises:
- Who are Kurt Akeley, Tim Sweeney, Michael Abrash, John Carmack, Palmer Luckey, and Inigo Quilez? What are they famous for? What are their apparent goals for computer graphics?
- Define the following terms:
- Human visual system (note constants for typical humans as well):
- Presence (the technical term in VR research, not the common definition or VR press "magic feeling")
- c.p.d.
- i.p.d.
- f.o.v.
- Fovea
- Beta movement
- Flicker fusion
- Mach band
- Phi phenomenon
- Judder
- Vergence
- Accommodation
- Scotopic vision
- Displays and rendering:
- Display lag (latency)
- Frame rate
- Foveated rendering
- Low-persistence
- Rolling shutter
- Raster
- Scanout
- Gamma
- HDR
- Lightfield display
- Anaglyph stereo
- Familiarize yourself with the following announced products and companies:
- Oculus
- Gear VR
- Magic Leap
- Microsoft Hololens
- PlayStation VR (a.k.a. Sony's Project Morpheus)
- Valve HTC Vive
- What are the major technical differences between the Oculus CV1 and the Valve Vive?
- What causes "ghosting"?
- What are the sources of latency in the VR pipeline?
- Why is head-mounted display VR harder than CAVE-style VR?
- What is the role of the head tracker? What is the difference between "outside in" and "inside out" tracking?
- How does an HMD make objects that are displayed centimeters from the eye appear to be meters away?
- Why are current HMDs so thick (distance from the eye to the "front" of the display)? What is needed to improve this?
- Why are current HMDs so tall (distance from the bottom to the top of the display)? What is needed to improve this?
- What resolution for an HMD would be equivalent to an iMac retina display at 0.5m from the eye? To a movie theatre screen from a center seat?
- Many HMDs use cell-phone panels rotated to landscape orientation. Do these update horizontally or vertically when used for an HMD?
- How fast can an OLED be driven? What about an LCD?
- What bandwidth is required to display RGB8 (24-bit) data at 3840x2160 ("4k") resolution at 120 Hz?
- What are typical latency and bandwidth values for Bluetooth? For WiFi?
- Consider the demographic of prominent VR developers.
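A quick back-of-the-envelope check for the bandwidth exercise above (plain Python; raw pixel data only, ignoring blanking intervals and compression):

```python
# Raw pixel bandwidth for an RGB8 (24-bit) stream at 3840x2160, 120 Hz.
width, height = 3840, 2160   # pixels
bits_per_pixel = 24          # 8 bits each for R, G, B
refresh_hz = 120

bits_per_second = width * height * bits_per_pixel * refresh_hz
print(f"{bits_per_second / 1e9:.1f} Gbit/s")  # -> 23.9 Gbit/s
```

Compare that figure against the Bluetooth and WiFi numbers you find for the following exercise.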
VR Avatars
Supplemental Background:
- How does the Kinect work?, John MacCormick, Talk at Dickinson College, 2011
- What is Motion Capture?, VICON homepage, viewed 2015
- Ready Player One, Ernest Cline, Broadway Books, 400 pages, 2012
- Avalon, dir: Mamoru Oshii, Film, 2001
- Real-time physical modelling of character movements with Microsoft Kinect, Hubert Shum and Edmond S.L. Ho, Proceedings of the 18th ACM Symposium on Virtual Reality Software and Technology, 2012
Reading:
- Real-Time High-Fidelity Facial Performance Capture, Chen Cao, Derek Bradley, Kun Zhou, Thabo Beeler, SIGGRAPH, August 2015
- Facial Performance Sensing Head-Mounted Display, Hao Li, Laura Trutoiu, Kyle Olszewski, Lingyu Wei, Tristan Trutna, Pei-Lun Hsieh, Aaron Nicholls, and Chongyang Ma, SIGGRAPH, August 2015
Followup Work:
- Driving High-Resolution Facial Scans with Video Performance Capture, Graham Fyffe, Andrew Jones, Oleg Alexander, Ryosuke Ichikari, and Paul Debevec, ACM Transactions on Graphics 34(1), 2014
- The Matrix, film, dir: The Wachowskis, 1999
- eXistenZ, film, dir: David Cronenberg, 1999
- The papers we read intentionally focus on tracking, to separate it from the rendering problem. For examples of recent real-time rendering of faces, see Jorge Jimenez's work.
Questions and Exercises:
- What do the following technical terms mean in the context of this research?
- Prior (noun)
- Blend shapes
- Optical flow
- Geodesic distance
- Difference of Gaussians
- Gaussian Mixture Model
- Affine
- RGB-D
- CMOS
- IMU
- OLED
- Regression
- Correlation
- How does a depth camera work? In what situations do they fail?
- What are the inputs to Li et al.'s and Cao et al.'s algorithms? How reasonable is it to assume those inputs? That is, is there another research problem in providing them, or are they viable with today's technology?
- Both papers use machine learning. How do their uses differ significantly?
- What is the Iterative Closest Point Algorithm?
- Li et al. claim that their strain sensor results are on-par with the latest
depth sensor systems. Do you agree, after considering Cao et al.'s work?
- What are the limitations of each system?
- Could you implement these systems? If not, what else would you need to know?
- What is the latency of each system? What is its sampling rate?
- How would you apply either of these techniques to animating a face other than the actor's own, i.e., transferring the animation in real time?
- Describe how you would build a complete system for virtual face-to-face communication from NY to LA using techniques from these papers. What would the specifications be? What would it cost to produce a prototype?
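For the Iterative Closest Point question above, a minimal 2D sketch of the algorithm (brute-force nearest neighbors plus a closed-form rigid fit via SVD; function names are my own, and a real system would add subsampling, outlier rejection, and a convergence test):

```python
import numpy as np

def icp_step(src, dst):
    """One ICP iteration: match each src point to its nearest dst point,
    then apply the rigid transform (R, t) minimizing least-squares error."""
    # Nearest-neighbor correspondences (brute force for clarity).
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    matched = dst[d2.argmin(axis=1)]

    # Closed-form rigid alignment (Kabsch/Umeyama via SVD).
    mu_s, mu_m = src.mean(0), matched.mean(0)
    H = (src - mu_s).T @ (matched - mu_m)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:        # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_m - R @ mu_s
    return src @ R.T + t

def icp(src, dst, iters=20):
    """Iterate alignment steps; correspondences improve as src moves."""
    for _ in range(iters):
        src = icp_step(src, dst)
    return src
```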
Design for 3D Printing
Supplemental Background:
- Upright Orientation of Man-Made Objects, Hongbo Fu, Daniel Cohen-Or, Gideon Dror, Alla Sheffer, SIGGRAPH, 2008
Reading:
- Chopper: Partitioning Models into 3D-Printable Parts, Linjie Luo, Ilya Baran, Szymon Rusinkiewicz, and Wojciech Matusik, SIGGRAPH Asia, 2012
Followup Work:
- Notes on the Synthesis of Form, Christopher Alexander, 1964
- Design and Fabrication by Example, Adriana Schulz, Ariel Shamir, David I.W. Levin, Pitchaya Sitthi-amorn, and Wojciech Matusik, SIGGRAPH, 2014
- Parametric Self-supporting Surfaces via Direct Computation of Airy Stress Functions, M. Miki, T. Igarashi, and P. Block, SIGGRAPH, 2015
3D Printing Dynamic Objects
Supplemental Background:
- Intersection of Convex Objects: The Method of Separating Axes, David Eberly, 2001
- The Levenberg-Marquardt method for nonlinear least squares curve-fitting problems, Henri P. Gavin, 2013
- Nonlinear Constrained Optimization: Methods and Software, Sven Leyffer and Ashutosh Mahajan, March 17, 2010
Reading (choose one):
- Foldabilizing Furniture, Honghua Li, Ruizhen Hu, Ibraheem Alhashim, and Hao Zhang, SIGGRAPH, August 2015
- Interactive Design of 3D-Printable Robotic Creatures, Megaro et al., SIGGRAPH ASIA 2015
Followup Work:
- 3D-Printing of Non-Assembly Articulated Models, Cali et al., SIGGRAPH Asia, 2012
- Computational Design of Mechanical Characters, Stelian Coros, Bernhard Thomaszewski, Gioacchino Noris, Shinjiro Sueda, Moira Forberg, Bob Sumner, Wojciech Matusik and Bernd Bickel, SIGGRAPH, 2013
Stylized Rendering
Supplemental Background:
- Expressive Rendering, chapter 34 of Computer Graphics: Principles and Practice 3rd Edition
- Suggestive Contours for Conveying Shape, Doug DeCarlo, Adam Finkelstein, Szymon Rusinkiewicz, and Anthony Santella, SIGGRAPH 2003
- Highlight Lines for Conveying Shape, Doug DeCarlo and Szymon Rusinkiewicz, NPAR 2007
Reading:
- Selections from Art and Illusion: A Study of the Psychology of Pictorial Representation, Gombrich, Princeton University Press, 1961
- Where do People Draw Lines?, Forrester Cole, Aleksey Golovinskiy, Alex Limpaecher, Heather Stoddart Barros, Adam Finkelstein, Thomas Funkhouser, and Szymon Rusinkiewicz, SIGGRAPH, 2008
Followup Work:
- Image Analogies, Aaron Hertzmann, Charles E. Jacobs, Nuria Oliver, Brian Curless, and David H. Salesin, SIGGRAPH, 2001
- Automated Generation of Interactive 3D Exploded View Diagrams, Wilmot Li, Maneesh Agrawala, Brian Curless, and David Salesin, SIGGRAPH, August 2008
Questions and Exercises:
- What are the goals of stylized rendering? There are many, and the field holds many different viewpoints about goals and methodology.
- Prior to this paper, how was line-drawing rendering researched in computer graphics?
- What is the experiment? Focus on the critical aspects of its design, not just generality. Why are the choices that they made important in the design?
- How was the gathered data reconciled across subjects, since each person will draw with a different scale, orientation, and internal proportion?
- What are sources of error in this experiment?
- Why is this experiment significant for computer graphics?
- Try the experiment with a small group. Draw a few of the shapes in isolation.
- Define all of the major curves recognized by computer graphics (e.g., contour) both mathematically and intuitively (you may need to look at prior art to find good definitions).
- Identify in your own drawings which curves you used, and where and why.
- Artists are seeking to convey different things in different styles. Are edges conveying shape or value in your drawing? For what applications would different choices be important? (Consider all of the places that line drawings appear)
- This experiment was a good way of objectively extracting the expertise of artists (although many of the participants don't appear to be very good artists...). However, the scientists then make their own naive subjective statements about the data towards the end of the paper. How could that be improved?
- I asked professional artists about some of the images. The best answer I received to "How would you draw this?" was: "I wouldn't. I'd draw it from another viewpoint. This is an ambiguous composition." This answer has deep implications for how we should pursue stylized rendering, some of which are (computationally) unfortunate. Unpack the artist's statement, my claim, and the implications.
- Human artists learn by a combination of theory and reinforcement learning. What are the corresponding algorithms? How might that be applied in computer graphics?
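Since one exercise above asks for mathematical definitions of the major curves, here is the mesh-based definition of the contour in runnable form: an edge lies on the contour when one of its two adjacent triangles faces the viewer and the other faces away. This is an illustrative sketch of my own, not code from any of the readings:

```python
import numpy as np

def contour_edges(verts, tris, eye):
    """Return mesh edges on the contour: edges shared by one front-facing
    and one back-facing triangle, as seen from the point `eye`."""
    facing = {}      # triangle index -> True if front-facing
    edge_tris = {}   # undirected edge -> adjacent triangle indices
    for t, (a, b, c) in enumerate(tris):
        n = np.cross(verts[b] - verts[a], verts[c] - verts[a])
        view = verts[a] - np.asarray(eye, float)
        facing[t] = np.dot(n, view) < 0          # normal opposes view ray
        for e in [(a, b), (b, c), (c, a)]:
            edge_tris.setdefault(tuple(sorted(e)), []).append(t)
    return [e for e, ts in edge_tris.items()
            if len(ts) == 2 and facing[ts[0]] != facing[ts[1]]]
```

Try running it on a tetrahedron and moving the eye point; the contour (here, also the silhouette) changes with viewpoint, which is exactly what distinguishes it from fixed feature curves such as creases.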
Augmented Reality
Supplemental Background:
- Rendering synthetic objects into real scenes, Debevec, SIGGRAPH, 1998
- Simultaneous Localisation and Mapping (SLAM): Part I The Essential Algorithms, Durrant-Whyte and Bailey, IEEE Robotics & Automation Magazine, 2006
- Simultaneous Localisation and Mapping (SLAM): Part II State of the Art, Durrant-Whyte and Bailey, IEEE Robotics & Automation Magazine, 2006
- Simultaneous Localization and Mapping: Literature Survey, Choudhary (unknown)
- Rendering Synthetic Objects into Legacy Photographs, Karsch et al., SIGGRAPH Asia 2012
Reading:
- Automatic Scene Inference for 3D Object Compositing, Karsch et al., ACM Transactions on Graphics, 2014 (watch the video for Karsch's 2011 paper as well)
Followup:
- AprilTag: A robust and flexible visual fiducial system, Edwin Olson, IEEE Conference on Robotics and Automation, 2011
- Pinlight Displays: Wide Field of View Augmented Reality Eyeglasses using Defocused Point Light Sources, Maimone et al., SIGGRAPH 2014
- High-Quality Reflections, Refractions, and Caustics in Augmented Reality and their Contribution to Visual Coherence, Kan et al., ISMAR 2012
Post-Modern Video Games
Supplemental Background:
- Three Postmodern Games: Self-Reflexive Metacommentary, Kevin Wong, The Artifice, June 2013
- Half-Life 2, Game
- Half-Real: Video Games between Real Rules and Fictional Worlds, Juul, MIT Press, 2011
- The Aesthetic of Play, Upton, MIT Press, 2015
Reading:
- The Stanley Parable, Davey Wreden, Game, Steam, 2013
- Irma Vep, dir: Olivier Assayas, Film, 1996
- Why Hotline Miami is an important game, Rami Ismail, Gamasutra, October 2012
Followup Work:
- eXistenZ, dir: David Cronenberg, Film, 1999
- Holy Motors, dir: Leos Carax, Film, 2012
- Hotline Miami, Jonatan Söderström and Dennis Wedin, Game, Steam, 2012
- Portal, Valve Corporation, Steam, Game, 2007
- Works of Game, John Sharp, MIT Press, 2015
Questions and Exercises:
This topic is challenging, inherently and by design. I want you to still bring a technical perspective and scientifically grounded arguments, but you'll use those techniques to question our goals themselves.
The overall point is: while computer scientists are trying to solve the problems of rendering 19th- and early 20th-century paintings (see the expressive graphics topics from this course), creating special effects for films that are anchored in the 1970s, and working to build the virtual reality vision articulated in the 1990s, here is what leading artists are doing in mainstream visual media today.
Moreover, the examples chosen are self-reflective, addressing the current art of games and film in a way that is simultaneously optimistic and highly critical. This prompts a number of issues:
- Are we even solving the right technical problems? Maybe nobody wants a human-created oil painting any more, let alone one from a computer.
- Are there technical problems in current forms that we should be solving?
- What do new works tell us about the relative importance of technique vs. technology?
- Are new media technologies such as VR going to advance art, or is technology not where the serious problems are any more? Don't dismiss technology out of hand: the automated loom, new paints, the printing press, the camera, sound recording, and the computer undeniably advanced visual media.
- How are the analogous criticisms of film and games implicit in these works likely to manifest in new media, such as 3D printing, VR, and AR experiences? Can we minimize those in advance through careful design?
Binary Shading
Supplemental Background:
- Expressive Rendering, chapter 34 of Computer Graphics: Principles and Practice, 3rd Edition
- Artistic Thresholding, Jie Xu and Craig S. Kaplan, Proceedings of the ACM SIGGRAPH Symposium on Non-Photorealistic Rendering, 2008
- Semi-Automatic Stencil Creation Through Error Minimization, Jonathan Bronson, Penny Rheingans and Marc Olano, Proceedings of the ACM SIGGRAPH Symposium on Non-Photorealistic Rendering, 2008
- Stylized Black and White Images from Photographs, David Mould and Kevin Grant, Proceedings of the ACM SIGGRAPH Symposium on Non-Photorealistic Rendering, 2008
Reading:
- Renaissance, film, dir: Christian Volckman, 2006
- Binary Shading using Appearance and Geometry, Bert Buchholz, Tamy Boubekeur, Doug DeCarlo, and Marc Alexa, Computer Graphics Forum 2010
Followup Work:
- Sin City, dirs: Frank Miller, Robert Rodriguez, Quentin Tarantino, film, 2005
- MadWorld, PlatinumGames, Sega, video game, 2009
Image Self-Similarity
Supplemental Background:
- Expressive Rendering, chapter 34 of Computer Graphics: Principles and Practice, 3rd Edition
- Texture Synthesis by Non-parametric Sampling, Alexei A. Efros and Thomas K. Leung, IEEE International Conference on Computer Vision, September 1999
- Image Analogies, A. Hertzmann, C. Jacobs, N. Oliver, B. Curless, D. Salesin, SIGGRAPH 2001 Conference Proceedings (or, see the CACM version, which was written for a non-graphics audience)
- PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing, Connelly Barnes, Eli Shechtman, Adam Finkelstein, and Dan B Goldman, SIGGRAPH 2009
- The Generalized PatchMatch Correspondence Algorithm, Connelly Barnes, Eli Shechtman, Dan B Goldman, Adam Finkelstein, European Conference on Computer Vision, September 2010
Reading:
- Image Melding: Combining Inconsistent Images using Patch-based Synthesis, Soheil Darabi, Eli Shechtman, Connelly Barnes, Dan B Goldman, and Pradeep Sen, SIGGRAPH 2012
Followup Work:
- Summarizing Visual Data Using Bidirectional Similarity, Denis Simakov, Yaron Caspi, Eli Shechtman, and Michal Irani, Proceedings of CVPR 2008
- Eigenfaces for Recognition, Matthew Turk and Alex Pentland, Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991
- Image Quilting for Texture Synthesis and Transfer, Alexei A. Efros and William T. Freeman, SIGGRAPH 2001
Synthetic Creatures
Supplemental Background:
- Motion, chapter 35 of Computer Graphics: Principles and Practice 3rd edition
Reading:
Skim all and then pick one to focus on:
- Evolving Virtual Creatures, Karl Sims, SIGGRAPH 1994
- Real-time motion retargeting to highly varied user-created morphologies ("the Spore paper"), Hecker et al., SIGGRAPH 2008
- A New Step for Artificial Creatures, Lassabe, IEEE Artificial Life, 2007
- Unshackling Evolution: Evolving Soft Robots with Multiple Materials and a Powerful Generative Encoding, Cheney et al., ACM SIGEVO Genetic and Evolutionary Computation Conference, 2013
- Flexible Muscle-Based Locomotion for Bipedal Creatures, Geijtenbeek et al., ACM SIGGRAPH ASIA 2013
- Flocks, Herds, and Schools: A Distributed Behavioral Model, Craig Reynolds, SIGGRAPH 1987
Optional two-week programming assignment:
Implement a 2D point mass + massless spring animation system in the style of (the now-defunct) Sodaplay and evolve creatures for a task. Sample tasks are: run fastest, jump highest, cover the most area (pick up rings). The Sodaplay system is nice because it requires only 2D point and spring physics plus collision with planes; that is a lot less complex than rigid body, arbitrary collision, and 3D physics, which lets you focus on the evolution aspect. The only parameters in this system are: topology, spring constant, spring rest length, mass, motorized spring period, motorized spring phase.
Consider what kind of locomotion you want: friction against the ground plane and gravity lead to walking and jumping; friction against the environment and weak gravity lead to swimming or flying; etc.
You might consider the option of some creatures that can have springs to the ground, making them plants. It is easy to extend this model to an ecosystem by parallel evolution of different classes of creatures, with large populations feeding on one another and competing for resources.
I hypothesize that one reason many evolved creatures look unnatural is that they lack any model of sensors and delicate internal organs. One way to extend this project is with a notion of sensing located at a specific, vulnerable point, as well as some abstract organ cavity. Creatures would ideally evolve to protect their sensitive bits and favor motions that protect and hold high their sensors.
If you're excited about making your creatures look good as well as move well, you might try fixing the topology and then stretching a sprite over them using As-Rigid-As-Possible Mesh Animation. In this way, you could easily make something that looks like an animated cartoon insect or dinosaur.
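To make the physics concrete, here is a minimal semi-implicit Euler step for the point-mass-and-spring model described above, including motorized springs and ground-plane collision. All names and constants are illustrative; damping, friction, and the evolution loop are left to you:

```python
import math

def step(pos, vel, mass, springs, dt, gravity=(0.0, -9.8), t=0.0):
    """Advance a 2D point-mass + spring system one semi-implicit Euler step.
    pos, vel: lists of [x, y]; springs: (i, j, k, rest, amp, period, phase)."""
    force = [[m * gravity[0], m * gravity[1]] for m in mass]
    for i, j, k, rest, amp, period, phase in springs:
        # Motorized springs oscillate their rest length over time.
        r = rest + amp * math.sin(2 * math.pi * t / period + phase)
        dx = pos[j][0] - pos[i][0]
        dy = pos[j][1] - pos[i][1]
        length = math.hypot(dx, dy) or 1e-9
        f = k * (length - r) / length        # Hooke's law along the spring
        force[i][0] += f * dx; force[i][1] += f * dy
        force[j][0] -= f * dx; force[j][1] -= f * dy
    for n in range(len(pos)):
        # Semi-implicit Euler: update velocity first, then position.
        vel[n][0] += force[n][0] / mass[n] * dt
        vel[n][1] += force[n][1] / mass[n] * dt
        pos[n][0] += vel[n][0] * dt
        pos[n][1] += vel[n][1] * dt
        if pos[n][1] < 0.0:                  # collide with the ground plane y = 0
            pos[n][1] = 0.0
            vel[n][1] = 0.0
    return pos, vel
```

An evolution run would repeatedly call `step` for a fixed time budget, score the creature on the chosen task (distance covered, height reached), and mutate the seven spring parameters.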
Optimal Animation
Supplemental Background:
- Motion, chapter 35 of Computer Graphics: Principles and Practice 3rd edition
- Physically Based Modeling: Principles and Practice, Andrew Witkin and David Baraff, SIGGRAPH 1997 Course Notes
- Multiresolution particle-based fluids, Richard Keiser, Bart Adams, Philip Dutre, and Leonidas Guibas, Technical Report, ETH Zurich, 2007
Reading:
- Synthesizing Physically Realistic Human Motion in Low-Dimensional, Behavior-Specific Spaces, Alla Safonova, Jessica K. Hodgins, and Nancy S. Pollard, SIGGRAPH 2004
- Highly Adaptive Liquid Simulations on Tetrahedral Meshes, Ryoichi Ando, Nils Thurey, Chris Wojtan, SIGGRAPH 2013
Followup Work:
- Real-time Eulerian water simulation using a restricted tall cell grid, Nuttapong Chentanez and Matthias Muller, SIGGRAPH 2011
- Principles of Traditional Animation Applied to 3D Computer Animation, John Lasseter, SIGGRAPH 87
- Nonconvex Rigid Bodies with Stacking, Eran Guendelman, Robert Bridson, Ronald Fedkiw, SIGGRAPH 2003
- Realtime style transfer for unlabeled heterogeneous human motion, Shihong Xia, Congyi Wang, Jinxiang Chai, and Jessica Hodgins, SIGGRAPH 2015
- Sampling Plausible Solutions to Multi-body Constraint Problems, Stephen Chenney and D. A. Forsyth, SIGGRAPH 2000
- View-Dependent Multiscale Fluid Simulation, Yue Gao, Chen-Feng Li, Bo Ren, and Shi-Min Hu, IEEE TVCG, 2012
Unconventional Animation Algorithms
Supplemental Background:
- Motion, chapter 35 of Computer Graphics: Principles and Practice 3rd edition
- Physically Based Modeling: Principles and Practice, Andrew Witkin and David Baraff, SIGGRAPH 1997 Course Notes
Reading (skim all, and then read and discuss one):
- Cellular Automata for Physical Modelling, Tom Forsyth, Game Programming Gems 3, 2001 (page 200) and The Making Of Dwarf Fortress (page 9), John Harris, Gamasutra, 2008 (focus on the hard case of volume-preserving water; check out recent results in Voxel Quest and speculate about how that system might work)
- All you need is force: a constraint-based approach for rigid body dynamics in computer animation, Kees van Overveld and Bart Barenbrug, Computer Animation and Simulation 1995
- Plausible Motion Simulation for Computer Graphics Animation, Ronen Barzel, John F. Hughes, and Daniel N. Wood, Computer Animation and Simulation 1996
- Meshless Deformations Based on Shape Matching, Matthias Muller, Bruno Heidelberger, Matthias Teschner, and Markus Gross, SIGGRAPH 2005
- Smooth Rotation Enhanced As-Rigid-As-Possible Mesh Animation, Zohar Levi and Craig Gotsman, IEEE TVCG 2015
Psycho
Supplemental Background:
- Psycho: A Casebook, Robert Kolker, Oxford 2004
- Hitchcock's Films Revisited (Psycho chapter), Robin Wood, 2002
- Relevant content from Hitchcock, Truffaut, 1985
Reading:
- Psycho, dir: Alfred Hitchcock, film, 1960
- Psycho Study Guide, Derek Malcolm, Film Education, 1995. You must complete all of the activities from the study guide and bring them to class.
Followup Work:
- Tools for Placing Cuts and Transitions in Interview Video, Floraine Berthouzoz, Wilmot Li, and Maneesh Agrawala, SIGGRAPH, July 2012
- Psycho, dir: Gus Van Sant, film, 1998
- Scorsese on Psycho, excerpt from Hitchcock/Truffaut documentary, 2015