Visual Media Revolution
CS372T (Q) · Fall 2015
Prof. Morgan McGuire
Williams College · Department of Computer Science
Assignments (negotiable):
- Voxels
- Implicit Surfaces
- Camera Art
- Camera Engineering
- Head-Mounted Displays
- Virtual Reality Avatars
- Design for 3D Printing
- 3D Printing Dynamic Objects
- Stylized Rendering
- Augmented Reality
- Post-Modern Video Games
Alternative assignments requested by students:
- Binary Shading
- Image Self-Similarity
- Optimal Animation
- Synthetic Creatures
- Unconventional Animation Algorithms
- Psycho
We live at the beginning of the second revolution in visual
media. Two centuries ago, the camera and the Jacquard loom
introduced machines for creating art. By automating the artist's
hand, they also forced questions of how objective technique
gives rise to subjective meaning and where the border lies
between mechanical and human contributions. Those progenitors
eventually led to digital film, computer games, and digital
content creation for architecture and industrial design.
Today, accessible and pervasive computation provokes a second
revolution. Augmented reality, 3D scanning, 3D printing, virtual
reality, and computational photography are exploding into
mainstream experience. Where previous digital media refined
analog practice through evolution, these are revolutionary forms
that could not exist without computation.
As the world seeks the promise of
new visual forms, we find that the fundamentals of earlier media
remain valid, and we take them as our guide. This tutorial
investigates the technology of emerging computational media and
explores their impact on the relationship between process and
aesthetics.
Administrative Catalog Information:
This course is a tutorial. It may not be taken on
a pass/fail basis or as a fifth course. Requires CS256:
Algorithms. The course is capped at 10 and preference will be given
to current or expected Computer Science majors per department
policy.
Structure: You and your tutorial partner will meet with
me once a week. One student will present the assigned reading
and the three of us will discuss and debate it. The presentation
role will alternate weekly between students, and both students
will be graded on performance each week. In some cases, a
short essay will also be required from each student.
One goal of the course is to practice presenting technical
material carefully. I will evaluate your presentation style and
clarity as well as content. Grading will be based on the
geometric (not arithmetic) mean of those elements, which
encourages you to develop your weakest areas rather than
doubling down on your strengths.
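The difference matters more than it may appear: under a geometric mean, one weak component drags the overall grade down more than a strong component can pull it up. A quick illustrative sketch (the scores are made up):

```python
from math import prod

def arithmetic_mean(scores):
    return sum(scores) / len(scores)

def geometric_mean(scores):
    # nth root of the product of n scores
    return prod(scores) ** (1 / len(scores))

balanced = [3.5, 3.5, 3.5]   # consistent performance
lopsided = [4.0, 4.0, 2.5]   # one weak area

# Both lists have the same arithmetic mean (3.5), but the lopsided
# student's geometric mean is lower (about 3.42).
```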
There are no mandatory programming exercises and no exams. I
anticipate that the average final grade in the course will be
3.5. As this is the first offering, grades will be curved
after ranking students to ensure a fair scaling.
Sessions: All meetings in TCL308 by default.
- M10: Jamie & Tony
- M11: Devin & Kyle
- W10: Nigel & Kelly
- W11: Lauren & Eli
- F10: Kai & Alex
There is a special organizational meeting on
Thursday, September 10 at 8:30pm in TCL104B to meet as an entire class,
review this syllabus and answer questions about it, and arrange times
for future full-group meetings.
Assignments: For each week there is typically one key
reading for presentation and discussion. Supplemental readings
preceding it help to provide background if you haven't
encountered the topic before. Supplemental ones following it are
related works that expand the topic or provide an alternate
view. You don't need to read all of the supplemental material.
I predict that the top students in the course will regularly
reference ideas from followup supplemental material
during discussion and that everyone will at least skim the background
material for many topics.
We'll use The
Graphics Codex as a common textbook for supplemental
readings that concisely survey well-understood areas of graphics. I recommend
Computer Graphics: Principles and
Practice as a comprehensive overview of the mathematics
of graphics. CG:PP, the other supplemental books, and the
films are on reserve in the library (plan ahead or buy your own
copy: there will be a lot of demand). I recommend screening the
films as a group because you'll hear each other's reactions
during the screening and naturally discuss them afterwards. The
games referenced are available in the Unix lab.
Expect to spend up to ten hours per week on the reading and to
feel a bit overwhelmed or lost about half of the time. You may--and
should--collaborate with your tutorial partner and others in the
course without restriction outside of session, so long as you
acknowledge their contributions in any written documents.
I list some sample questions for each topic. Although you may,
you are not required to explicitly answer these questions or
raise them in discussion. Instead, you should use these as
guides to focus your initial reading and help determine whether
you understand the articles and their impact. When you struggle
with the questions, go back to more of the supplemental reading
to help provide context. I expect you to go far beyond my
initial questions in your full exploration and discussion.
Beware that, particularly for a scientific publication, the
"pitch" of a paper was tuned for the reviewers at the time of
submission, and it often does not stand the test of time the
way the actual content does. In many cases, the significance of
a paper is not its primary claim, at least not as it
was presented.
Because tutorial groups will move at slightly different speeds
and may follow up with additional readings to dive deeper into
some topics, these assignments aren't tied to specific weeks.
Proceed through these in order. It is the responsibility of the
presenting student to ensure that the partner and professor
agree a week beforehand on the topic.
Reading: Do not read a scholarly paper or book like a
novel. When studying films and games, you also need to
have both the experience of a "normal" audience member and
to step back and analyze that experience.
You must be an active participant to approach
the reading material in this course. For many people
(including me), it is generally hard and a bit unpleasant,
although there is joy in appreciating it more deeply and
discovering insights.
When working with readings other than peer-reviewed scientific
publications in this course, I intend that you approach them
through the lens of science. Analyze and critique them in light
of the related scientific work from that week and other readings
in the course. Propose methods for quantifying or automating, or
enumerate the technical challenges in delivering the vision that
they put forth. This is the core of how computational graphics
research and development advance: by understanding the goals of
authors of visual media and then specifying and creating
technology to support them.
It is easy to criticize older works in light of newer ones and
modern cultural mores. Such criticism is largely a waste of time,
so avoid the temptation to take cheap shots. If I assign an
older work, it is so that we can learn from its methodology,
as well as appreciate the evolution of visual media to predict
its future. Ask how older works innovated relative to
their historical contexts, and what they teach us about how to
innovate in our own historical context. This is the most
important lesson to take away from the class: how great
scientists and artists work, not the specific output that they
produced as a result of their time and place.
Scholarship builds on the shoulders of giants. In any scholarly
work, probably 90% of the impressive work is actually something
that was well known in the field beforehand, 9% is an
application of an existing idea in a new way, and maybe 1% is
true innovation or new insight. This is often not clear from
the text, and you're expected to be familiar enough with the
area to understand what is really new.
For example, reading Kajiya's "The Rendering Equation" paper,
you might reasonably think that he invented the rendering
equation (he didn't) or recursive ray tracing (no)--actually,
he's contributing analysis and a crude form of importance
sampling. In Jensen's "Global Illumination with Photon Mapping,"
you might think that the photon tracing and radiance estimation
are his contributions. In fact, his contribution is using
a k-d tree for the photon map instead of a flat array,
and some particular low-precision representations of
photons. Those are both really important graphics papers, but
not for the reasons that they might appear to be. It is easy to
watch Star Wars: A New Hope and find it innovative for
effects and style...but much of that is coming from the
less-known 1972 Silent Running (which was also laying the
groundwork for WALL-E, and was following up on
Trumbull's FX work on the Kubrick film 2001). The 1982 Blade Runner is
revisiting many of the themes and visuals from Lang's
1927 Metropolis and 1931 M (all of which are
cinematic masterworks!)
For computer science and mathematics papers, I often have to
stop and write a little program to test the ideas from the text
or fully grasp them. I expect that you may too. Don't look at
equations or code listings and just say, "yes, that looks about
right." Ensure that you could reproduce them. Run through the
execution of algorithms by hand as if you were a computer.
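As an example of the kind of little program I mean, here is a quick check that a Monte Carlo estimator really converges to the integral it claims to compute, which is the core idea behind Kajiya's paper discussed above. The integrand and sample count are arbitrary choices for illustration.

```python
import random

def monte_carlo_integrate(f, a, b, n):
    """Estimate the integral of f over [a, b] by averaging n uniform samples."""
    total = sum(f(random.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

random.seed(0)
# The true integral of x^2 over [0, 1] is 1/3; with 100k samples the
# estimate lands close to it, and more samples tighten the error.
estimate = monte_carlo_integrate(lambda x: x * x, 0.0, 1.0, 100_000)
```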
I usually read the title and abstract, and
then jump to the results section. I try to understand how well
the technique works from the figures. Then, I double back to the
related work section to see how this paper relates to work I'm
familiar with. I may then go to the algorithms section or
introduction. Once I have a basic idea of what is going on in
the paper, I return to the beginning and make a full read of
the whole thing.
Expect to read some text, pause, ask yourself a question about
it, and try to figure out the next step on your own. When
you're reading an equation or looking at a diagram, ask yourself
what each symbol means. If you're not quite sure what is going
on, don't keep reading--backtrack and read that section
again. Maybe out loud. Try reproducing derivations. Be prepared
to skip around, refer to supplemental material or cited previous
work, and re-read critical passages many, many times.
For film and media criticism, I often have to step back and
summarize the author's points and then try to apply them in a
new context for evaluation. All professional writers are facile,
and it can be easy to let the words roll over one unquestioned.
What are the key terms? What are the
supports for the argument or claims for the analysis? What
are examples outside the text that support it? Can I think of
contradictory examples?
For visual works themselves, I find a good approach to
analysis is to focus on technique. What other works is this one
referencing? Can you find the same
shot/mechanic/line/costume/etc. in another work? This scene
makes you feel sad--why? Is it sad with the sound off? Would it
still be sad if viewed from another angle? How frequent are the
cuts? Where are the cameras and lights? Are all of the shots
from the same take or spliced from different performances? Make
a color script for the work. Make notes on the musical score and
effects. Whose point of view is represented? How reliable or
unbiased is that viewpoint? Are the actions depicted literal?
In a game, why did you make the choices you did? How did you
know the rules? What interactions are making you empathize with
specific characters?
I'd like to think the following is obvious, but since this may
be your first experience studying in multiple media, these
points may be obvious only in hindsight. You must first engage
each work as it was intended to be experienced. You can't
experience a film by reading about it--you must watch it. You
can't experience a game by watching someone else play it--you
must play it yourself. You can't experience a VR session through
a desktop port. You can't experience a sculpture by looking at a
picture--you must go see it in person. Etc. If a film was
intended for a theatre screen, watch it on the largest screen
that you can find (e.g.: Paresky auditorium or any classroom),
in a dark environment. Don't force it into a YouTube window
while you're chatting on Facebook in a crowded common room.
Presenting: Each week, the presenting student must e-mail
an agenda to his or her tutorial partner and me at least 48
hours before our session. This agenda may be in the form of an
outline (in ASCII text, HTML, or PDF) or PPT slides (only
PowerPoint is allowed). It must include the major points of the
presentation, presentation aids or their descriptions, and some
key questions or issues for discussion. However, if you're
working closely with your partner to create the presentation,
then you can send me the slides at the last minute. I'll give you
more useful feedback if you send it earlier.
The presenting student will walk us through the reading in about
30 minutes. Available tools include PowerPoint on my laptop, an
HDMI 1080p TV, printouts, and a blackboard with colored chalk.
Plan for 15 minutes of material, which will be slowed to about
30 minutes by questions. If working with your partner to create
the presentation, don't over-polish the content. Figure out a
lot of it for yourselves, but also leave openings for us to
explore it further in session instead of working everything
out. I want to be an active part of the exploration process
rather than seeing a perfect and final presentation.
Specifically, you have three good choices when you encounter
terms and ideas that you don't understand in the work:
- Don't mention these topics (and if I ask,
you can refocus the discussion on what you did want to talk about)
- Figure out what this topic is and explicitly present it, or at least
be prepared to present it if asked
- Explicitly mention what you didn't understand (possibly because you
didn't have time to trace down this tangent) and then open it for discussion
I prefer the last option because I want the tutorial to be about
us figuring out this material together in small groups. Ideally,
half of your presentation will be devoted to things you
struggled with and want to dig deeper on. However, topics vary
in how much of our limited discussion time they are worth.
What you should not do is mention a topic in passing and
hope that your partner and I won't notice. If you've raised an
issue in a presentation, you must be prepared to resolve
questions about it in some way.
If using slides, I highly recommend cutting and pasting figures,
still shots, and equations from the source work. Minimize
on-screen text (although you're free to insert text in Presenter
Notes) and instead present those ideas orally.
Throughout the presentation, the non-presenting partner should
add asides with new information, raise questions about what he
or she didn't understand (or the pair didn't understand), and
help respond to my questions. Near the beginning of the
semester, partners often choose to work as a pair and have equal
understanding of the work. But as your other obligations grow during
the semester, it is reasonable for the presenting partner to take the
lead and use the session to help the other partner understand the work.
The conversation may naturally transition to a free-form
discussion of the work and issues that it raises. If it does
not, the non-presenting partner should explicitly shift us
into a more free-form discussion after about 30 minutes.
Voxels and Filtering
Supplemental Background:
- GigaVoxels: Ray-Guided Streaming for Efficient and Detailed Voxel Rendering, Cyril Crassin, Fabrice Neyret, Sylvain Lefebvre, and Elmar Eisemann, I3D, February 2009
- Materials [_rn_matrls] chapter and Microfacet Distribution [D] topic in The Graphics Codex, McGuire, 2015
- Physically-Based Shading at Disney, Brent Burley, in Physically Based Shading, SIGGRAPH Courses, August, 2012
- Representing Appearance and Pre-filtering Subpixel Data in
Sparse Voxel Octrees, Eric Heitz and Fabrice Neyret, HPG, 2012
Reading:
- The SGGX Microflake Distribution, Eric Heitz, Jonathan Dupuy, Cyril Crassin, and Carsten Dachsbacher, SIGGRAPH, August 2015
Followup Work:
- Interactive Indirect Illumination Using Voxel Cone Tracing, Cyril Crassin, Fabrice Neyret, Miguel Sainz, Simon Green, and Elmar Eisemann, Pacific Graphics 2011
- A Survey of Non-linear Pre-filtering Methods
for Efficient and Accurate Surface Shading, Eric Bruneton and Fabrice Neyret, TVCG, 2011
Questions and Exercises:
- What is an AABB?
- What is a voxel?
- What is aliasing, in a signal-processing sense?
- What is an "anisotropic" material? Give some real-world examples.
- Write pseudocode for traversing an (axis-aligned, origin-centered) octree to find the node containing a 3D point.
- What is the relationship between the shading normal, the geometric normal, and the microfacet "distribution of normals"?
- Why do we want to "prefilter" data before sampling it?
- What is the difference between a "microfacet" and a "microflake"?
- Make a table expanding the abbreviations from the paper, including NCF, PDF, BRDF, LOD, LEAN, LEADR, VNDF, GPU, and CDF (tip: GGX doesn't really stand for anything).
- What is the SGGX representation? Make a pseudocode class definition for a model using it.
- What are the parameters stored at each leaf? (break this down into the individual scalars)
- What is the algorithm for filtering the parameters from child nodes to parents?
- What is the algorithm for interpolating the parameters between adjacent child nodes?
- How do you compute a SGGX representation from polygons?
- About how long would you expect it to take to trace one primary ray per pixel at 1080p using the SGGX representation? (milliseconds? seconds? minutes? hours?)
- What are the limitations of SGGX? What are the new research problems that should be tackled?
- The paper focuses on primary visibility and direct illumination. What kinds of bias and aliasing does SGGX not address in the context of a full 3D renderer? What assumptions does it make about the underlying geometry?
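Since one of the warm-up exercises above asks for octree traversal pseudocode, here is a minimal runnable sketch of point location in an axis-aligned, origin-centered octree. The class layout and field names are my own invention, not from the paper.

```python
class OctreeNode:
    def __init__(self, center, half_size, children=None):
        self.center = center          # (x, y, z) center of this cell
        self.half_size = half_size    # half of the cell's edge length
        self.children = children      # list of 8 child nodes, or None for a leaf

def find_leaf(node, p):
    """Descend to the leaf whose cell contains point p (assumed inside the root)."""
    while node.children is not None:
        # The child index packs one bit per axis: is p above the center plane?
        i = (int(p[0] >= node.center[0])
             | (int(p[1] >= node.center[1]) << 1)
             | (int(p[2] >= node.center[2]) << 2))
        node = node.children[i]
    return node
```

The same descent loop generalizes to ray traversal, which must visit the (up to) four children a ray pierces in front-to-back order rather than a single child.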
Implicit Surfaces and Point-Based Rendering
Supplemental Background:
- Surfels: Surface Elements as Rendering Primitives, Hanspeter Pfister, Matthias Zwicker, Jeroen van Baar, and Markus Gross, SIGGRAPH, August 2000
- Ray Marching chapter [_rn_rayMrch] in The Graphics Codex, McGuire, 2015
- Antialiasing: Are we there yet? (section on temporal antialiasing only), Marco Salvi, Talk in the Open Problems in Real-Time Rendering course at SIGGRAPH 2015, August 2015
Reading:
- Learning from Failure: a Survey of Promising, Unconventional and Mostly Abandoned Renderers for `Dreams PS4', a Geometrically Dense, Painterly UGC Game, Alex Evans, in Advances in Real-Time Rendering, SIGGRAPH Courses, August 10, 2015
Followup:
- Dreams Trailer, Media Molecule 2015
- Dreams Surreal Sandbox Talk, Alex Evans, E3, 2015
- Enhanced sphere tracing, Benjamin Keinert, Henry Schäfer, Johann Korndörfer, Urs Ganse, and Marc Stamminger, Smart Tools & Apps for Graphics, 2014
- Two Uses of Voxels in LittleBigPlanet2's Graphics Engine, Alex Evans and Anton Kirczenow, in Advances in Real Time Rendering, SIGGRAPH Courses, August 16, 2011
- Cascaded Light Propagation Volumes for Real-Time Indirect Illumination, Kaplanyan and Dachsbacher, I3D 2010 / Light Propagation Volumes in CryEngine 3, Kaplanyan, SIGGRAPH 2009 talk
Questions and Exercises:
- You'll need to understand the whole talk, but when you present it, I recommend focusing on the final renderer
- Who is Alex Evans? What major projects has he worked on? What is his background and bias in computer graphics?
- What do players do in this "game"? How do you play?
- What is an "implicit surface"?
- What is a superellipsoid?
- What is ZBrush? 3DS Max?
- Make a table expanding the following acronyms and what they mean: CSG, SDF, VB, IB (hint: vertex buffer, index buffer), CS (hint: compute shader), UI, FFD, and GCN.
- What are the Marching Cubes algorithm's input and output? How does it work? (This is a great algorithm that uses a lot of clever ideas; everyone should be aware of it)
- Quantify and compare the computational resources of the GPUs in PS4, iPhone 6, the highest-end MacBook Pro, and GeForce Titan X. What are the ball-park relative performances? (hint: I used Passmark's charts for a coarse comparison)
- Why does Evans abandon polygon modeling and rasterization?
- Give example pseudocode for a simple composite implicit surface, such as a snowman constructed from spheres and cylinders.
- List the renderers that Evans prototypes, with the key reason each is abandoned.
- Why doesn't Evans directly ray-trace the implicit surfaces?
- Why is it fortunate that no distance-smear or other nonlocal artist tools were added?
- Describe the key data structure in Evans' final renderer, and the orders of growth for computing and rendering from it.
- How does Evans animate models?
- Write pseudocode for the final renderer, assuming the model data structure has already been constructed.
- How is the fur modeled on the polar bear? What makes it look fuzzy?
- How does he animate the objects for the trailer(s)?
- Are they articulated rigid bodies, matrix skinned, or individual SDFs moving around?
- Is the scene re-voxelized on the fly during animation?
- How is the skeleton created for the animation?
- How is motion data entered by the player?
- How does he choose the orientation for the textured splats? There's one degree of freedom left since implicit surfaces only define a normal, not the orientation of the tangent basis in the tangent plane. (hint: Bridson)
- How does Evans discover the points for the final renderer? Are there really any voxels left at that point, or are they implicit?
- What are Hilbert curves, and why are they frequently used in computer graphics data structures?
- Why does Evans introduce the point clustering? (Hint: he's performing LOD per-cluster, but that isn't the primary reason to have clusters)
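One exercise above asks for pseudocode for a composite implicit surface. A minimal runnable sketch in the signed-distance-function style that Evans works with (the sphere placement and radii are my own invention) could look like:

```python
import math

def sdf_sphere(p, center, radius):
    """Signed distance from point p to a sphere: negative inside, positive outside."""
    return math.dist(p, center) - radius

def sdf_snowman(p):
    """A composite implicit surface: the union (min) of two stacked spheres."""
    body = sdf_sphere(p, (0.0, 0.0, 0.0), 1.0)
    head = sdf_sphere(p, (0.0, 1.3, 0.0), 0.6)
    return min(body, head)

# The surface is the zero set: the points p where sdf_snowman(p) == 0.
```

Richer compositions replace min with a smooth-minimum blend so the shapes merge without a crease, and subtraction (max with a negated distance) carves material away.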
Camera Art
Supplemental Background:
- Rear Window, dir: Alfred Hitchcock, film, 1954
- Composition + Framing - Storytelling with Cinematography, Simon Cade, Video, 9 min, February 7, 2015
- Camera Movement - Storytelling with Cinematography, Simon Cade, Video, 5 min, June 20, 2015
- Basic Cinematography, McGuire, 2014
Reading:
- The introduction and chapters 4, 5, 8, 12 in Hitchcock, Truffaut, Simon & Schuster, 1985 (originally 1967)
- North by Northwest, dir: Alfred Hitchcock, film, 1959
Followup Work:
- Sequencing the North by NorthWest Crop Dusting Scene, Barry Ritholtz, May 12th, 2011
- Chapters 1, 11, and 15 of Hitchcock, Truffaut, 1985
- Psycho, dir: Alfred Hitchcock, film, 1960
- High Anxiety, dir: Mel Brooks, film, 1977
- Enjoy It, Julia, While It Lasts, Michael J. Lewis, New York Times, May 19, 2005
- Grammar of the Film Language, Daniel Arijon, 624 pages, 1976 (1991 reprint)
- My Life in Film, Ingmar Bergman, 2011 (originally 1990)
- Films of My Life, Truffaut, 1994 (originally 1975)
Questions and Exercises:
- List specific scenes from subsequent thriller films that reference shots, sets, or plot elements from North by Northwest. James Bond films are a good place to look.
- Enumerate the principles Hitchcock uses for camera framing. What are his unwritten rules?
- When and how does the camera move in North by Northwest?
- For a few major scenes, write down a shot list. For each, give:
- Visual and audio transition from previous shot (e.g., cut, wipe, dissolve)
- Notes on transition from previous shot--what parameters were changed, and which were preserved? How did the camera move between the shots? When did the audio transition relative to the video?
- Approximate vertical field of view
- Approximate distance from the subject
- Approximate depth of field
- Height of the camera off the ground
- Roll and pitch of the camera
- Placement of the focus of attention in 2D within the frame
- Relative heights and sizes of characters as perceived in 2D
- How does the camera move within the shot?
- Shot duration
- Blocking of the main elements (e.g., characters) as both 2D in frame and top-view 3D, showing where the camera, lights, and elements are
- Does the exact frame from the movie poster actually appear in the film?
- Make a wall-clock timeline of the pacing of events in the crop-duster scene, as perceived by the audience
- Whose viewpoint does the film primarily follow? Which character do you identify with? What specific techniques created that empathy?
- Is the camera a reliable observer, or are we being shown events as perceived by specific characters?
- What does the audience observe or know that the character doesn't? What does the character know that the audience doesn't? (Obviously, this changes over time)
- What are the "special effects"? When are we seeing minatures, stunt doubles, forced perspective, or other tricks that create an artificial reality different from what really occured on set?
- When are you aware of the camera and lights, and when does the film fade into your own reality?
Camera Engineering
Supplemental Background:
- Camera Specifications and Transformations (chapter 13) and Splines and Subdivision Curves (chapter 22), Computer Graphics: Principles and Practice, 3rd Edition
- The virtual cinematographer: a paradigm for automatic real-time camera control and directing, Li-wei He, Michael F. Cohen, and David H. Salesin, SIGGRAPH, 1996
- Real-Time Cameras, Mark Haigh-Hutchinson, CRC Press, 544 pages, April 14, 2009
Reading:
- Camera Control in Computer Graphics, Marc Christie, Patrick Olivier, and Jean-Marie Normand, Computer Graphics Forum, 27(8):2197-2218, 2008
- The Last of Us (the PS4 Remastered edition, if you can get it), Naughty Dog, 2014
Followup Work:
- ShowMotion: Camera Motion based 3D Design Review, Nicolas Burtnyk, Azam Khan, George Fitzmaurice and Gord Kurtenbach, Proceedings of The ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, 2006
- Uncharted 2: Among Thieves, PS3 video game, Naughty Dog, 2009 (in Sawyer)
- Scroll Back: The Theory and Practice of Cameras in Side-Scrollers, Itay Keren, Gamasutra, May 2015
Questions and Exercises:
- Do Christie et al. propose any new algorithms, or only survey existing ones?
- What is INRIA?
- What is "cinematography"?
- What is the "ray casting" algorithm, and how would it be used for camera control?
- What is a "semantic representation"?
- Write pseudocode for a Hitchcock-style camera control algorithm to be executed by a human or extremely sophisticated AI. What are the major differences between this and the methods described by Christie et al.?
- Shadow of the Colossus and Super Mario 64 were widely regarded in the games industry as having the best cameras of games of the era of the ones discussed in the paper. Why? Are their techniques represented in the paper?
- The paper is from 2008; many games have been released
since then! How does each of the following games advance
the state of the art for cameras in virtual worlds?
- Republique
- Heavy Rain
- Uncharted 3
- God of War 3
- Bayonetta 2
- The Order: 1886
- The Last of Us
- Real-time 3D programs emulate some real camera
effects, such as lens flare, depth of field, chromatic
aberration, vignetting, and motion blur...but they
currently do so imperfectly. How does this affect
cinematography in real-time 3D applications?
- Rephrase this sentence in your own words:
"In practice, even partially automated three-dimensional
multimedia generation requires an interpretation and synthesis
framework by which both the visuospatial properties of a
viewpoint can be computed (i.e. the interpretive framework)
and the viewpoint controlled according to the constraints arising
from the semantics of the language used (i.e. the synthesis
framework)."
- Why do the authors claim that camera control is PSPACE-hard?
Do you agree, or think it is harder or easier?
- What does it mean for a camera model to be "Euler-based"? Why is that name used?
- What is "gimbal lock"?
- Why is real-time camera control perhaps closer to documentary cinematography than
fiction-film cinematography?
- Name and define some methods for shooting dialogue.
- What are the major camera control algorithm classes?
- What are the major direct control methods for cameras?
- What are all of the variables in equation 3? What does it signify?
- Almost all of the methods described abstract geometry to simple boxes, points, and spheres. Is this reasonable? Is it a major source of error?
- Are graphics researchers working on the right problems in cinematography? Are there open problems, or do current approaches cover the space pretty well?
- Direct character control is the major motivation for avoiding cuts and certain kinds of camera movements in games. However, recent games are increasingly moving to higher level control of characters--this lends itself to more interesting interactions as well. What are current or hypothetical interaction paradigms that admit more interesting cinematography?
- How does the cinematography in the latest video games compare to classic Hollywood cinema? To today's cinema?
- How would you incorporate principles from film cinematography into real-time applications?
- How does virtual reality change the goal of real-time cinematography?
- The paper poses an interesting idea in passing: game cameras are like documentary ones, where they report the action of the game somewhat objectively and with minimal manipulation of the scene. This gives some insight into the constraints on game cameras and where we might look for specific techniques. But it also prompts the question of how objective non-fiction film really is. What are the major differences and "rules" for editorial control in:
- Cinema Verite
- Direct Cinema
- Visual Journalism
- Documentary
What does it mean for a visual work to accurately depict its subject? Does that preclude editorial control and manipulation? Can that in fact be achieved without editorial control and manipulation?
- As a followup to the previous question, perform some cursory research (Wikipedia and IMDB are fine in this case) on the following landmark works:
- Nanook of the North (1922)
- Man with a Movie Camera (1929)
- An American Family (1973)
- The Thin Blue Line (1988)
- Bowling for Columbine (2002)
- The Bridge (2006)
- The Act of Killing (2012)
Head-Mounted Displays
Supplemental Background:
- Asynchronous Timewarp Examined, Michael Antonov, March 3, 2015 (blog post)
- Adaptive frameless rendering, Abhinav Dayal et al., EGSR 2005
- What VR could, should, and almost certainly will be within two years, Michael Abrash, Steam Developer Days talk 2014
- Keynote, John Carmack, Oculus Connect 2014 (video)
- Snow Crash, Neal Stephenson, 480 pages, 1992
- A billion in computer graphics, McGuire, February 2015 (blog post)
- No, it's not a "Retina Display", John Hable, blog post May 27, 2011
Reading:
- Abrash's survey of VR research:
- Why virtual reality isn't real to your brain [pt 1] (start reading at "How images get displayed"), Michael Abrash, May 15, 2013 (blog post)
- Why virtual reality isn't real to your brain: judder, Michael Abrash, June 20, 2013, (blog post)
- Down the VR rabbit hole: Fixing judder, Michael Abrash, July 26, 2013 (blog post)
- Foveated 3D Graphics, Guenter et al., SIGGRAPH Asia 2012
- Epic Games and VR, Tim Sweeney, September 24, 2015 (video)
Followup:
- Mobile VR, John Carmack, GDC talk, 2015 (video)
- The Most Important Movie of 2015 Is a VR Cartoon About a Hedgehog, Angela Watercutter, Wired, July 2015
- Oculus Rift Manufacturing Overview, RoadToVR.com, October 2015
- Unraveling the enigma of Nintendo's virtual boy, 20 years later, Benj Edwards, FastCompany, 2015
- Cascaded Displays: Spatiotemporal Super-Resolution Using Offset Pixel Layers, Heide et al., SIGGRAPH, 2014
- Perception of Highlight Disparity at a Distance in Consumer Head-Mounted Displays, Robert Toth, Jon Hasselgren, and Tomas Akenine-Moller, HPG, 2015
- A Stereo Display Prototype with Multiple Focal Distances, Akeley et al., SIGGRAPH 2004
Questions and Exercises:
- Who are Kurt Akeley, Tim Sweeney, Michael Abrash, John Carmack, Palmer Luckey, and Inigo Quilez? What are they famous for? What are their apparent goals for computer graphics?
- Define the following terms:
- Human visual system (note constants for typical humans as well):
- Presence (the technical term in VR research, not the common definition or VR press "magic feeling")
- c.p.d.
- i.p.d.
- f.o.v.
- Fovea
- Beta movement
- Flicker fusion
- Mach band
- Phi phenomenon
- Judder
- Vergence
- Accommodation
- Scotopic vision
- Displays and rendering:
- Display lag (latency)
- Frame rate
- Foveated rendering
- Low-persistence
- Rolling shutter
- Raster
- Scanout
- Gamma
- HDR
- Lightfield display
- Anaglyph stereo
- Familiarize yourself with the following announced products and companies:
- Oculus
- Gear VR
- Magic Leap
- Microsoft Hololens
- PlayStation VR (a.k.a. Sony's Project Morpheus)
- Valve HTC Vive
- What are the major technical differences between the Oculus CV1 and the Valve Vive?
- What causes "ghosting"?
- What are the sources of latency in the VR pipeline?
- Why is head-mounted display VR harder than CAVE-style VR?
- What is the role of the head tracker? What is the difference between "outside in" and "inside out" tracking?
- How does an HMD make objects that are displayed centimeters from the eye appear to be meters away?
- Why are current HMDs so thick (distance from the eye to the "front" of the display)? What is needed to improve this?
- Why are current HMDs so tall (distance from the bottom to the top of the display)? What is needed to improve this?
- What resolution for an HMD would be equivalent to an iMac retina display at 0.5m from the eye? To a movie theatre screen from a center seat?
- Many HMDs use cell-phone panels rotated to landscape orientation. Do these update horizontally or vertically when used for an HMD?
- How fast can an OLED be driven? What about an LCD?
- What bandwidth is required to display RGB8 (24-bit) data at 3840x2160 ("4k") resolution at 120 Hz?
- What are typical latency and bandwidth values for Bluetooth? For WiFi?
- Consider the demographic of prominent VR developers.
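A quick back-of-the-envelope check for the bandwidth exercise above (plain Python; raw pixel data only, ignoring blanking intervals and compression):

```python
# Raw pixel bandwidth for an RGB8 (24-bit) stream at 3840x2160, 120 Hz.
width, height = 3840, 2160   # pixels
bits_per_pixel = 24          # 8 bits each for R, G, B
refresh_hz = 120

bits_per_second = width * height * bits_per_pixel * refresh_hz
print(f"{bits_per_second / 1e9:.1f} Gbit/s")  # -> 23.9 Gbit/s
```

Compare that figure against the Bluetooth and WiFi numbers you find for the following exercise.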
VR Avatars
Supplemental Background:
- How does the Kinect work?, John MacCormick, Talk at Dickinson College, 2011
- What is Motion Capture?, VICON homepage, viewed 2015
- Ready Player One, Ernest Cline, Broadway Books, 400 pages, 2012
- Avalon, dir: Mamoru Oshii, Film, 2001
- Real-time physical modelling of character movements with Microsoft Kinect, Hubert Shum and Edmond S.L. Ho, Proceedings of the 18th ACM Symposium on Virtual Reality Software and Technology, 2012
Reading:
- Real-Time High-Fidelity Facial Performance Capture, Chen Cao, Derek Bradley, Kun Zhou, Thabo Beeler, SIGGRAPH, August 2015
- Facial Performance Sensing Head-Mounted Display, Hao Li, Laura Trutoiu, Kyle Olszewski, Lingyu Wei, Tristan Trutna, Pei-Lun Hsieh, Aaron Nicholls, and Chongyang Ma, SIGGRAPH, August 2015
Followup Work:
- Driving High-Resolution Facial Scans with Video Performance Capture, Graham Fyffe, Andrew Jones, Oleg Alexander, Ryosuke Ichikari, and Paul Debevec, ACM Transactions on Graphics 34(1), 2014
- The Matrix, film, dir: The Wachowskis, 1999
- eXistenZ, film, dir: David Cronenberg, 1999
- The papers we read intentionally focus on tracking, to separate it from the rendering problem. For examples of recent real-time rendering of faces, see Jorge Jimenez's work.
Questions and Exercises:
- What do the following technical terms mean in the context of this research?
- Prior (noun)
- Blend shapes
- Optical flow
- Geodesic distance
- Difference of Gaussians
- Gaussian Mixture Model
- Affine
- RGB-D
- CMOS
- IMU
- OLED
- Regression
- Correlation
- How does a depth camera work? In what situations do they fail?
- What are the inputs to Li et al.'s and Cao et al.'s algorithms? How reasonable is it to assume those inputs? That is, is there another research problem in providing them, or are they viable with today's technology?
- Both papers use machine learning. How do their uses differ significantly?
- What is the Iterative Closest Point Algorithm?
- Li et al. claim that their strain sensor results are on-par with the latest
depth sensor systems. Do you agree, after considering Cao et al.'s work?
- What are the limitations of each system?
- Could you implement these systems? If not, what else would you need to know?
- What is the latency of each system? What is its sampling rate?
- How would you apply either of these techniques to animating a face other than the actor's own, i.e., transferring the animation in real time?
- Describe how you would build a complete system for virtual face-to-face communication from NY to LA using techniques from these papers. What would the specifications be? What would it cost to produce a prototype?
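For the Iterative Closest Point question above, a minimal 2D sketch of the algorithm (brute-force nearest neighbors plus a closed-form rigid fit via SVD; function names are my own, and a real system would add subsampling, outlier rejection, and a convergence test):

```python
import numpy as np

def icp_step(src, dst):
    """One ICP iteration: match each src point to its nearest dst point,
    then apply the rigid transform (R, t) minimizing least-squares error."""
    # Nearest-neighbor correspondences (brute force for clarity).
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    matched = dst[d2.argmin(axis=1)]

    # Closed-form rigid alignment (Kabsch/Umeyama via SVD).
    mu_s, mu_m = src.mean(0), matched.mean(0)
    H = (src - mu_s).T @ (matched - mu_m)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:        # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_m - R @ mu_s
    return src @ R.T + t

def icp(src, dst, iters=20):
    """Iterate alignment steps; correspondences improve as src moves."""
    for _ in range(iters):
        src = icp_step(src, dst)
    return src
```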
Design for 3D Printing
Supplemental Background:
- Upright Orientation of Man-Made Objects, Hongbo Fu, Daniel Cohen-Or, Gideon Dror, Alla Sheffer, SIGGRAPH, 2008
Reading:
- Chopper: Partitioning Models into 3D-Printable Parts, Linjie Luo, Ilya Baran, Szymon Rusinkiewicz, and Wojciech Matusik, SIGGRAPH Asia, 2012
Followup Work:
- Notes on the Synthesis of Form, Christopher Alexander, 1964
- Design and Fabrication by Example, Adriana Schulz, Ariel Shamir, David I.W. Levin, Pitchaya Sitthi-amorn, and Wojciech Matusik, SIGGRAPH, 2014
- Parametric Self-supporting Surfaces via Direct Computation of Airy Stress Functions, M. Miki, T. Igarashi, and P. Block, SIGGRAPH, 2015
3D Printing Dynamic Objects
Supplemental Background:
- Intersection of Convex Objects: The Method of Separating Axes, David Eberly, 2001
- The Levenberg-Marquardt method for nonlinear least squares curve-fitting problems, Henri P. Gavin, 2013
- Nonlinear Constrained Optimization: Methods and Software, Sven Leyffer and Ashutosh Mahajan, March 17, 2010
Reading (choose one):
- Foldabilizing Furniture, Honghua Li, Ruizhen Hu, Ibraheem Alhashim, and Hao Zhang, SIGGRAPH, August 2015
- Interactive Design of 3D-Printable Robotic Creatures, Megaro et al., SIGGRAPH ASIA 2015
Followup Work:
- 3D-Printing of Non-Assembly Articulated Models, Cali et al., SIGGRAPH Asia, 2012
- Computational Design of Mechanical Characters, Stelian Coros, Bernhard Thomaszewski, Gioacchino Noris, Shinjiro Sueda, Moira Forberg, Bob Sumner, Wojciech Matusik and Bernd Bickel, SIGGRAPH, 2013
Stylized Rendering
Supplemental Background:
- Expressive Rendering, chapter 34 of Computer Graphics: Principles and Practice 3rd Edition
- Suggestive Contours for Conveying Shape, Doug DeCarlo, Adam Finkelstein, Szymon Rusinkiewicz, and Anthony Santella, SIGGRAPH 2003
- Highlight Lines for Conveying Shape, Doug DeCarlo and Szymon Rusinkiewicz, NPAR 2007
Reading:
- Selections from Art and Illusion: A Study of the Psychology of Pictorial Representation, Gombrich, Princeton University Press, 1961
- Where do People Draw Lines?, Forrester Cole, Aleksey Golovinskiy, Alex Limpaecher, Heather Stoddart Barros, Adam Finkelstein, Thomas Funkhouser, and Szymon Rusinkiewicz, SIGGRAPH, 2008
Followup Work:
- Image Analogies, Aaron Hertzmann, Charles E. Jacobs, Nuria Oliver, Brian Curless, and David H. Salesin, SIGGRAPH, 2001
- Automated Generation of Interactive 3D Exploded View Diagrams, Wilmot Li, Maneesh Agrawala, Brian Curless, and David Salesin, SIGGRAPH, August 2008
Questions and Exercises:
- What are the goals of stylized rendering? There are many, and the field holds many different viewpoints about goals and methodology.
- Prior to this paper, how was line-drawing rendering researched in computer graphics?
- What is the experiment? Focus on the critical aspects of its design, not just generality. Why are the choices that they made important in the design?
- How was the gathered data reconciled across subjects, since each person will draw with a different scale, orientation, and internal proportion?
- What are sources of error in this experiment?
- Why is this experiment significant for computer graphics?
- Try the experiment with a small group. Draw a few of the shapes in isolation.
- Define all of the major curves recognized by computer graphics (e.g., contour) both mathematically and intuitively (you may need to look at prior art to find good definitions).
- Identify in your own drawings which curves you used, and where and why.
- Artists are seeking to convey different things in different styles. Are edges conveying shape or value in your drawing? For what applications would different choices be important? (Consider all of the places that line drawings appear)
- This experiment was a good way of objectively extracting the expertise of artists (although many of the participants don't appear to be very good artists...). However, the scientists then make their own naive subjective statements about the data towards the end of the paper. How could that be improved?
- I asked professional artists about some of the images. The best answer I received to "How would you draw this?" was: "I wouldn't. I'd draw it from another viewpoint. This is an ambiguous composition." This answer has deep implications for how we should pursue stylized rendering, some of which are (computationally) unfortunate. Unpack the artist's statement, my claim, and the implications.
- Human artists learn by a combination of theory and reinforcement learning. What are the corresponding algorithms? How might that be applied in computer graphics?
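Since one exercise above asks for mathematical definitions of the major curves, here is the mesh-based definition of the contour in runnable form: an edge lies on the contour when one of its two adjacent triangles faces the viewer and the other faces away. This is an illustrative sketch of my own, not code from any of the readings:

```python
import numpy as np

def contour_edges(verts, tris, eye):
    """Return mesh edges on the contour: edges shared by one front-facing
    and one back-facing triangle, as seen from the point `eye`."""
    facing = {}      # triangle index -> True if front-facing
    edge_tris = {}   # undirected edge -> adjacent triangle indices
    for t, (a, b, c) in enumerate(tris):
        n = np.cross(verts[b] - verts[a], verts[c] - verts[a])
        view = verts[a] - np.asarray(eye, float)
        facing[t] = np.dot(n, view) < 0          # normal opposes view ray
        for e in [(a, b), (b, c), (c, a)]:
            edge_tris.setdefault(tuple(sorted(e)), []).append(t)
    return [e for e, ts in edge_tris.items()
            if len(ts) == 2 and facing[ts[0]] != facing[ts[1]]]
```

Try running it on a tetrahedron and moving the eye point; the contour (here, also the silhouette) changes with viewpoint, which is exactly what distinguishes it from fixed feature curves such as creases.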
Augmented Reality
Supplemental Background:
- Rendering synthetic objects into real scenes, Debevec, SIGGRAPH, 1998
- Simultaneous Localisation and Mapping (SLAM): Part I The Essential Algorithms, Durrant-Whyte and Bailey, IEEE Robotics & Automation Magazine, 2006
- Simultaneous Localisation and Mapping (SLAM): Part II State of the Art, Durrant-Whyte and Bailey, IEEE Robotics & Automation Magazine, 2006
- Simultaneous Localization and Mapping: Literature Survey, Choudhary (unknown)
- Rendering Synthetic Objects into Legacy Photographs, Karsch et al., SIGGRAPH Asia 2012
Reading:
- Automatic Scene Inference for 3D Object Compositing, Karsch et al., ACM Transactions on Graphics, 2014 (watch the video for Karsch's 2011 paper as well)
Followup:
- AprilTag: A robust and flexible visual fiducial system, Edwin Olson, IEEE Conference on Robotics and Automation, 2011
- Pinlight Displays: Wide Field of View Augmented Reality Eyeglasses using Defocused Point Light Sources, Maimone et al., SIGGRAPH 2014
- High-Quality Reflections, Refractions, and Caustics in Augmented Reality and their Contribution to Visual Coherence, Kan et al., ISMAR 2012
Post-Modern Video Games
Supplemental Background:
- Three Postmodern Games: Self-Reflexive Metacommentary, Kevin Wong, The Artifice, June 2013
- Half-Life 2, Game
- Half-Real: Video Games between Real Rules and Fictional Worlds, Juul, MIT Press, 2011
- The Aesthetic of Play, Upton, MIT Press, 2015
Reading:
- The Stanley Parable, Davey Wreden, Game, Steam, 2013
- Irma Vep, dir: Olivier Assayas, Film, 1996
- Why Hotline Miami is an important game, Rami Ismail, Gamasutra, October 2012
Followup Work:
- eXistenZ, dir: David Cronenberg, Film, 1999
- Holy Motors, dir: Leos Carax, Film, 2012
- Hotline Miami, Jonatan Söderström and Dennis Wedin, Game, Steam, 2012
- Portal, Valve Corporation, Steam, Game, 2007
- Works of Game, John Sharp, MIT Press, 2015
Questions and Exercises:
This topic is challenging, inherently and by design. I want you to still bring a technical perspective and scientifically grounded arguments, but you'll use those techniques to question our goals themselves.
The overall point is: while computer scientists are trying to solve the problems of rendering 19th- and early 20th-century paintings (see the expressive graphics topics from this course), creating special effects for films that are anchored in the 1970s, and working to build the virtual reality vision articulated in the 1990s, here is what leading artists are doing in mainstream visual media today.
Moreover, the examples chosen are self-reflective, addressing the current art of games and film in a way that is simultaneously optimistic and highly critical. This prompts a number of issues:
- Are we even solving the right technical problems? Maybe nobody wants a human-created oil painting any more, let alone one from a computer.
- Are there technical problems in current forms that we should be solving?
- What do new works tell us about the relative importance of technique vs. technology?
- Are new media technologies such as VR going to advance art, or is technology not where the serious problems are any more? Don't dismiss technology out of hand: the automated loom, new paints, the printing press, the camera, sound recording, and the computer undeniably advanced visual media.
- How are the analogous criticisms of film and games implicit in these works likely to manifest in new media, such as 3D printing, VR, and AR experiences? Can we minimize those in advance through careful design?
Binary Shading
Supplemental Background:
- Expressive Rendering, chapter 34 of Computer Graphics: Principles and Practice, 3rd Edition
- Artistic Thresholding, Jie Xu and Craig S. Kaplan, Proceedings of the ACM SIGGRAPH Symposium on Non-Photorealistic Rendering, 2008
- Semi-Automatic Stencil Creation Through Error Minimization, Jonathan Bronson, Penny Rheingans and Marc Olano, Proceedings of the ACM SIGGRAPH Symposium on Non-Photorealistic Rendering, 2008
- Stylized Black and White Images from Photographs, David Mould and Kevin Grant, Proceedings of the ACM SIGGRAPH Symposium on Non-Photorealistic Rendering, 2008
Reading:
- Renaissance, film, dir: Christian Volckman, 2006
- Binary Shading using Appearance and Geometry, Bert Buchholz, Tamy Boubekeur, Doug DeCarlo, and Marc Alexa, Computer Graphics Forum 2010
Followup Work:
- Sin City, dirs: Frank Miller, Robert Rodriguez, Quentin Tarantino, film, 2005
- MadWorld, PlatinumGames, Sega, video game, 2009
Image Self-Similarity
Supplemental Background:
- Expressive Rendering, chapter 34 of Computer Graphics: Principles and Practice, 3rd Edition
- Texture Synthesis by Non-parametric Sampling, Alexei A. Efros and Thomas K. Leung, IEEE International Conference on Computer Vision, September 1999
- Image Analogies, A. Hertzmann, C. Jacobs, N. Oliver, B. Curless, D. Salesin, SIGGRAPH 2001 Conference Proceedings (or, see the CACM version, which was written for a non-graphics audience)
- PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing, Connelly Barnes, Eli Shechtman, Adam Finkelstein, and Dan B Goldman, SIGGRAPH 2009
- The Generalized PatchMatch Correspondence Algorithm, Connelly Barnes, Eli Shechtman, Dan B Goldman, Adam Finkelstein, European Conference on Computer Vision, September 2010
Reading:
- Image Melding: Combining Inconsistent Images using Patch-based Synthesis, Soheil Darabi, Eli Shechtman, Connelly Barnes, Dan B Goldman, and Pradeep Sen, SIGGRAPH 2012
Followup Work:
- Summarizing Visual Data Using Bidirectional Similarity, Denis Simakov, Yaron Caspi, Eli Shechtman, and Michal Irani, Proceedings of CVPR 2008
- Eigenfaces for Recognition, Matthew Turk and Alex Pentland, Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991
- Image Quilting for Texture Synthesis and Transfer, Alexei A. Efros and William T. Freeman, SIGGRAPH 2001
Synthetic Creatures
Supplemental Background:
- Motion, chapter 35 of Computer Graphics: Principles and Practice 3rd edition
Reading:
Skim all and then pick one to focus on:
- Evolving Virtual Creatures, Karl Sims, SIGGRAPH 1994
- Real-time motion retargeting to highly varied user-created morphologies ("the Spore paper"), Hecker et al., SIGGRAPH 2008
- A New Step for Artificial Creatures, Lassabe, IEEE Artificial Life, 2007
- Unshackling Evolution: Evolving Soft Robots with Multiple Materials and a Powerful Generative Encoding, Cheney et al., ACM SIGEVO Genetic and Evolutionary Computation Conference, 2013
- Flexible Muscle-Based Locomotion for Bipedal Creatures, Geijtenbeek et al., ACM SIGGRAPH ASIA 2013
- Flocks, Herds, and Schools: A Distributed Behavioral Model, Craig Reynolds, SIGGRAPH 1987
Optional two-week programming assignment:
Implement a 2D point mass + massless spring animation system in the style of (the now-defunct) Sodaplay and evolve creatures for a task. Sample tasks are: run fastest, jump highest, cover the most area (pick up rings). The Sodaplay system is nice because it requires only 2D point and spring physics plus collision with planes; that is a lot less complex than rigid body, arbitrary collision, and 3D physics, which lets you focus on the evolution aspect. The only parameters in this system are: topology, spring constant, spring rest length, mass, motorized spring period, motorized spring phase.
Consider what kind of locomotion you want: friction against the ground plane and gravity lead to walking and jumping; friction against the environment and weak gravity lead to swimming or flying; etc.
You might consider the option of some creatures that can have springs to the ground, making them plants. It is easy to extend this model to an ecosystem by parallel evolution of different classes of creatures, with large populations feeding on one another and competing for resources.
I hypothesize that one reason many evolved creatures look unnatural is that they lack any model of sensors and delicate internal organs. One way to extend this project is with a notion of sensing located at a specific, vulnerable point, as well as some abstract organ cavity. Creatures would ideally evolve to protect their sensitive bits and favor motions that protect and hold high their sensors.
If you're excited about making your creatures look good as well as move well, you might try fixing the topology and then stretching a sprite over them using As-Rigid-As-Possible Mesh Animation. In this way, you could easily make something that looks like an animated cartoon insect or dinosaur.
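To make the physics concrete, here is a minimal semi-implicit Euler step for the point-mass-and-spring model described above, including motorized springs and ground-plane collision. All names and constants are illustrative; damping, friction, and the evolution loop are left to you:

```python
import math

def step(pos, vel, mass, springs, dt, gravity=(0.0, -9.8), t=0.0):
    """Advance a 2D point-mass + spring system one semi-implicit Euler step.
    pos, vel: lists of [x, y]; springs: (i, j, k, rest, amp, period, phase)."""
    force = [[m * gravity[0], m * gravity[1]] for m in mass]
    for i, j, k, rest, amp, period, phase in springs:
        # Motorized springs oscillate their rest length over time.
        r = rest + amp * math.sin(2 * math.pi * t / period + phase)
        dx = pos[j][0] - pos[i][0]
        dy = pos[j][1] - pos[i][1]
        length = math.hypot(dx, dy) or 1e-9
        f = k * (length - r) / length        # Hooke's law along the spring
        force[i][0] += f * dx; force[i][1] += f * dy
        force[j][0] -= f * dx; force[j][1] -= f * dy
    for n in range(len(pos)):
        # Semi-implicit Euler: update velocity first, then position.
        vel[n][0] += force[n][0] / mass[n] * dt
        vel[n][1] += force[n][1] / mass[n] * dt
        pos[n][0] += vel[n][0] * dt
        pos[n][1] += vel[n][1] * dt
        if pos[n][1] < 0.0:                  # collide with the ground plane y = 0
            pos[n][1] = 0.0
            vel[n][1] = 0.0
    return pos, vel
```

An evolution run would repeatedly call `step` for a fixed time budget, score the creature on the chosen task (distance covered, height reached), and mutate the seven spring parameters.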
Optimal Animation
Supplemental Background:
- Motion, chapter 35 of Computer Graphics: Principles and Practice 3rd edition
- Physically Based Modeling: Principles and Practice, Andrew Witkin and David Baraff, SIGGRAPH 1997 Course Notes
- Multiresolution particle-based fluids, Richard Keiser, Bart Adams, Philip Dutre, and Leonidas Guibas, Technical Report, ETH Zurich, 2007
Reading:
- Synthesizing Physically Realistic Human Motion in Low-Dimensional, Behavior-Specific Spaces, Alla Safonova, Jessica K. Hodgins, and Nancy S. Pollard, SIGGRAPH 2004
- Highly Adaptive Liquid Simulations on Tetrahedral Meshes, Ryoichi Ando, Nils Thurey, Chris Wojtan, SIGGRAPH 2013
Followup Work:
- Real-time Eulerian water simulation using a restricted tall cell grid, Nuttapong Chentanez and Matthias Muller, SIGGRAPH 2011
- Principles of Traditional Animation Applied to 3D Computer Animation, John Lasseter, SIGGRAPH 87
- Nonconvex Rigid Bodies with Stacking, Eran Guendelman, Robert Bridson, Ronald Fedkiw, SIGGRAPH 2003
- Realtime style transfer for unlabeled heterogeneous human motion, Shihong Xia, Congyi Wang, Jinxiang Chai, and Jessica Hodgins, SIGGRAPH 2015
- Sampling Plausible Solutions to Multi-body Constraint Problems, Stephen Chenney and D. A. Forsyth, SIGGRAPH 2000
- View-Dependent Multiscale Fluid Simulation, Yue Gao, Chen-Feng Li, Bo Ren, and Shi-Min Hu, IEEE TVCG, 2012
Unconventional Animation Algorithms
Supplemental Background:
- Motion, chapter 35 of Computer Graphics: Principles and Practice 3rd edition
- Physically Based Modeling: Principles and Practice, Andrew Witkin and David Baraff, SIGGRAPH 1997 Course Notes
Reading (skim all, and then read and discuss one):
- Cellular Automata for Physical Modelling, Tom Forsyth, Game Programming Gems 3, 2001 (page 200) and The Making Of Dwarf Fortress (page 9), John Harris, Gamasutra, 2008 (focus on the hard case of volume-preserving water; check out recent results in Voxel Quest and speculate about how that system might work)
- All you need is force: a constraint-based approach for rigid body dynamics in computer animation, Kees van Overveld and Bart Barenbrug, Computer Animation and Simulation 1995
- Plausible Motion Simulation for Computer Graphics Animation, Ronen Barzel, John F. Hughes, and Daniel N. Wood, Computer Animation and Simulation 1996
- Meshless Deformations Based on Shape Matching, Matthias Muller, Bruno Heidelberger, Matthias Teschner, and Markus Gross, SIGGRAPH 2005
- Smooth Rotation Enhanced As-Rigid-As-Possible Mesh Animation, Zohar Levi and Craig Gotsman, IEEE TVCG 2015
Psycho
Supplemental Background:
- Psycho: A Casebook, Robert Kolker, Oxford 2004
- Hitchcock's Films Revisited (Psycho chapter), Robin Wood, 2002
- Relevant content from Hitchcock, Truffaut, 1985
Reading:
- Psycho, dir: Alfred Hitchcock, film, 1960
- Psycho Study Guide, Derek Malcolm, Film Education, 1995. You must complete all of the activities from the study guide and bring them to class.
Followup Work:
- Tools for Placing Cuts and Transitions in Interview Video, Floraine Berthouzoz, Wilmot Li, and Maneesh Agrawala, SIGGRAPH, July 2012
- Psycho, dir: Gus Van Sant, film, 1998
- Scorsese on Psycho, excerpt from Hitchcock/Truffaut documentary, 2015