**CPU Real Time Rendering - Report**
**Alexander Majercik, Ryan Patton, Tuan Tran & Eli Meckler asm4@williams.edu rjp2@williams.edu mt12@williams.edu ecm3@williams.edu**

Introduction
============================================

We present a CPU Real Time Ray Tracer based on Whitted Ray Tracing [#Whitted1980] with constant-time light approximation. Our Ray Tracer uses batch-processed CPU multithreading with *barrier synchronization* to trace primary rays, and recursive Whitted Ray Tracing to capture the impulse rays from reflective and refractive pixels.

Topic Overview
=============================================================

The main challenge for a CPU real time renderer is speed. The program must render scenes fast enough to produce tens of frames per second (FPS) without sacrificing image quality. The standard Ray Tracing and Path Tracing algorithms we have learned are not fast enough to render scenes in real time using just the CPU.

To achieve a sufficient speed-up, we implemented the Whitted Ray Tracing algorithm [#Whitted1980]. This algorithm exploits the insight that tracing secondary rays to augment the radiance of a surfel most strongly affects reflective and refractive surfaces. The algorithm therefore traces all primary rays from the camera to the scene, but only casts secondary rays for surfels with reflective or refractive materials. To efficiently determine which surfels have these properties, we calculate the *impulses* of each surfel (reflective, refractive, or otherwise). A surfel yields either no secondary rays (it neither reflects nor refracts), one secondary ray (it reflects or refracts, but not both), or two secondary rays (it both reflects and refracts). This implementation allows us to capture the most essential parts of reflection and refraction while maintaining the ability to render in real time.
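The impulse classification described above can be sketched in a few lines of C++. This is only an illustration under simplifying assumptions: `ToySurfel`, `impulseCount`, and `countSecondaryRays` are hypothetical names, and two boolean flags stand in for the material queries that the real renderer makes through G3D surfels.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a surface element. The project itself uses G3D
// surfels; two flags are enough to illustrate the classification.
struct ToySurfel {
    bool reflective;
    bool refractive;
};

// Number of secondary (impulse) rays spawned at a surfel: 0, 1, or 2.
int impulseCount(const ToySurfel& surfel) {
    return (surfel.reflective ? 1 : 0) + (surfel.refractive ? 1 : 0);
}

// Total secondary rays needed for one batch of primary-ray hits.
// Surfels without impulses contribute nothing, which is where the
// savings over tracing secondary rays everywhere come from.
int countSecondaryRays(const std::vector<ToySurfel>& primaryHits) {
    int total = 0;
    for (const ToySurfel& surfel : primaryHits) {
        const int n = impulseCount(surfel);
        assert(n >= 0 && n <= 2);
        total += n;
    }
    return total;
}
```

For a batch containing one diffuse surfel, one mirror, and one surfel that both reflects and refracts, `countSecondaryRays` reports three secondary rays for three primary rays.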
Impulse Diagram
-------------------------------------------------------------------------------

[Diagram: rays cast from the eye into the scene, labeled "Reflection", "Reflection and Refraction", and "No impulse".]

*Modified from Graphics Diagram in [Markdeep Feature Demo](https://casual-effects.com/markdeep/features.md.html)*

The Whitted Ray Tracing algorithm can render images in real time. The rendered images reproduce most of the phenomena of a path traced image, with some exceptions. Because secondary rays are not cast from non-reflective, non-refractive surfaces, the algorithm does not render caustics or produce diffuse interreflection.

Specification
============================================================================================

(Adapted from [Paths](http://graphicscodex.com/projects/paths/#toc2) lab spec)

Implement a real-time ray tracer with the following properties by refactoring the [Paths](../paths/index.html) project:

1. Batch-processed CPU multithreading with *barrier synchronization*, where all primary rays are traced in parallel, shaded in parallel, and then scattered and re-launched according to the Whitted Ray Tracing algorithm.
2. A method `cpuRealTimeWhitted::traceImpulse(const Ray& ray, const int recursiveDepth)` that recursively traces impulse rays for reflective and refractive surfels.
3. Soft shadows through ambient lighting, mirror reflections, and refraction, at ~20 FPS and 420x300 resolution.
4. Use a library-provided *spatial data structure* (i.e., [`G3D::TriTree`](http://g3d.cs.williams.edu/g3d/G3D10/build/manual/namespace_g3_d.html#a31cf5746e822e5265b8f40c5de0847d4)) for accelerating ray intersection and a library-provided *importance-sampled scattering function* (i.e., [`G3D::Surfel::scatter`](http://g3d.cs.williams.edu/g3d/G3D10/build/manual/class_g3_d_1_1_surfel.html#aa10ef7078e60effdfa4841b3c27353e2))
5. *Emissive* and *direct* illumination from point sources at finite locations, and *indirect* illumination terms calculated by tracing impulse rays as per the Whitted Ray Tracing algorithm.
6. *Post-processing* including gamma encoding (e.g., via [`G3D::Film::exposeAndRender`](http://g3d.cs.williams.edu/g3d/G3D10/build/manual/class_g3_d_1_1_film.html#a7e00177f4f5c6e18fef7c2ba56d99f63))

Program Design
=============================================================

Major Class/Function | Description
---------------------|------------
[class cpuRealTimeWhitted](../build/doc/classcpu_real_time_whitted.html) | Casts primary rays for all pixels, and secondary rays for reflective/refractive surfels
[cpuRealTimeWhitted::setScene](../build/doc/classcpu_real_time_whitted.html#a5b610d0b96c9c3dc7046668e62dba836) | Initializes TriTree and skybox, prunes light array
[cpuRealTimeWhitted::traceImage](../build/doc/classcpu_real_time_whitted.html#ada68cdf31398b14a063154579201c232) | Calls helper functions to render image
[cpuRealTimeWhitted::generateRays](../build/doc/classcpu_real_time_whitted.html#a84f112eb77a1e98080b5e8a0cc5ee010) | Generates all primary rays
[cpuRealTimeWhitted::findIntersections](../build/doc/classcpu_real_time_whitted.html#acfe28e6251eafdc45c3e7657cf68d687) | Finds intersections for all primary rays
[cpuRealTimeWhitted::writeEmittedAndReflectiveAmbient](../build/doc/classcpu_real_time_whitted.html#ae66ab7a46a7a1c5a2b17fd1112b015ea) | Writes radiance values for ambient light and emitted light to the radiance buffer
[cpuRealTimeWhitted::findRadianceAndShadowRays](../build/doc/classcpu_real_time_whitted.html#aebf13da5cf7a7c496b5953eaa9137c41) | Gets direct illumination and calculates shadow rays
[cpuRealTimeWhitted::checkShadows](../build/doc/classcpu_real_time_whitted.html#a242d0d136168cd56491420f86fd71e67) | Checks if shadow rays hit the surfel
[cpuRealTimeWhitted::addShading](../build/doc/classcpu_real_time_whitted.html#ac145bb3519ed230777d8164d1299c4cd) | Shades surfel
[cpuRealTimeWhitted::addImpulses](../build/doc/classcpu_real_time_whitted.html#a5e4f24f967f3a47a188d11ae43d0c01e) | Computes radiance from impulse rays
[class App](../build/doc/class_app.html) | Initializes GUI, renders images in real time using cpuRealTimeWhitted
[App::pathTraceScene](../build/doc/class_app.html#ab0f8a2bf9088d6798089e64e4b4c5c0c) | Creates the image if no image exists, calls cpuRealTimeWhitted::traceImage to render image

**Implementation Design**

In `App.cpp`, we modify `onGraphics3D` to update the camera image when the camera moves or the scene changes, which allows our renderer to run in real time. To `onGraphics3D`, we add a call to the method `pathTraceScene`, passing it a scalar value that reduces the resolution of the image so that the renderer can run in real time at a higher FPS. `pathTraceScene` creates an instance of `PathTracer` and calls `PathTracer::traceImage`, which renders the scene using the method described below.

We implemented the Whitted Ray Tracing algorithm building off of a Path Tracer written by Eli and Tuan. The path tracer is multithreaded when casting the first rays from the camera into the scene, but further rays are cast recursively per surfel depending on the impulses of the surfel. `Surfel::Impulses` gives an array of impulses (there can be at most two) for each surfel.
`rayCast` casts a ray corresponding to each impulse and calculates direct light and ambient light at the surfel it hits, recursively casting more rays depending on the impulse values of that surfel. All code is built upon the [G3D API](http://g3d.cs.williams.edu/g3d/G3D10/build/manual/apiindex.html).

Our program is designed with three values in mind: image correctness, speed, and readability.

Image Correctness
-----------------------------------

There are a couple of ways to implement the Whitted algorithm for a real time renderer. To avoid the branching factor of two, we could restrict the impulses to similar-type impulse paths. For instance, after the initial refraction impulse ray cast, only refractive impulses are ray cast. The same goes for the reflective impulse. The downside to this method is that, despite eliminating the branching factor, it does not ensure image correctness. A mirror will look black if viewed through glass, and glass appears opaque when its reflection is seen in a mirror. In this way, we would lose the correctness of the Whitted algorithm.

Instead, we chose to keep the branching factor to ensure image correctness to an arbitrary impulse depth. Our recursive call to cast impulse rays follows both impulses at a surface (assuming both exist).

Speed
-----------------------------------

Keeping render time low while maintaining image correctness is challenging on a CPU. Since correctness for the Whitted algorithm requires following two impulse paths, we expect $ O(2^{d}) $ recursive ray casts, where $ d $ is our recursion depth. On images with mirrors and refracting surfaces, this exponential factor results in a huge slow-down. To counteract this effect, we maintain concurrency at the highest level. All eye ray casting is performed in parallel, and each impulse recursion tree from each eye ray is run in parallel.
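A rough picture of batch processing with barrier synchronization is given below. It is a sketch under stated assumptions, not the project's code: it uses plain `std::thread` workers rather than G3D's threading utilities, `parallelStage` and `renderToyFrame` are hypothetical names, and the per-pixel work is a trivial stand-in for intersection and shading. Each stage processes an entire buffer in parallel, and joining the workers acts as the barrier before the next stage reads that buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Runs body(i) for every i in [0, count) on up to workerCount threads and
// then joins them all. The joins are the barrier: the caller does not
// continue until every element of this stage has been processed.
void parallelStage(std::size_t count, unsigned workerCount,
                   const std::function<void(std::size_t)>& body) {
    assert(workerCount > 0);
    std::vector<std::thread> workers;
    const std::size_t chunk = (count + workerCount - 1) / workerCount;
    for (unsigned w = 0; w < workerCount; ++w) {
        const std::size_t begin = std::min(count, static_cast<std::size_t>(w) * chunk);
        const std::size_t end = std::min(count, begin + chunk);
        if (begin == end) break;  // no work left for further workers
        workers.emplace_back([begin, end, &body] {
            for (std::size_t i = begin; i < end; ++i) body(i);
        });
    }
    for (std::thread& t : workers) t.join();  // barrier
}

// Hypothetical two-stage frame. Stage 1 fills a "hit" buffer for all pixels;
// stage 2 shades from that buffer. Stage 2 starts only after stage 1 is done.
std::vector<float> renderToyFrame(std::size_t pixelCount, unsigned workerCount) {
    std::vector<int> hitBuffer(pixelCount);
    std::vector<float> radianceBuffer(pixelCount);
    parallelStage(pixelCount, workerCount, [&](std::size_t i) {
        hitBuffer[i] = static_cast<int>(i % 3);  // stand-in for an intersection result
    });
    parallelStage(pixelCount, workerCount, [&](std::size_t i) {
        radianceBuffer[i] = 0.5f * static_cast<float>(hitBuffer[i]);  // stand-in for shading
    });
    return radianceBuffer;
}
```

Calling `renderToyFrame(10, 4)` returns ten values in which element `i` equals `0.5f * (i % 3)`. Because each worker writes a disjoint range of the buffer, no locking is needed inside a stage.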
We take advantage of the cache by batching work through buffers that store all pertinent values between components: rays, surfels, radiance, and booleans are all stored in arrays for quicker parallel access. Each recursive impulse tree is traced in parallel, but this step cannot take much advantage of caching.

Readability
-----------------------------------

Our code is broken up into concurrent calls across all pixels, but helper methods called on individual elements in the buffers help clarify the logic of the program. This lets the reader think about how each call runs in isolation rather than about all of the concurrent calls happening at once.

Correctness Results
=============================================================







Performance
===========================================================

Component | Time at 480x300 (ms) | Time at 1280x720 (ms) | Time ratio
:------------------|:---------------:|:--------------------------:|:---------------------:
`generateRays` | 3.6 | 23.0 | 6.4
`findIntersections` | 35.2 | 190.1 | 5.4
`writeEmittedAndReflected` | 3.0 | 18.1 | 6.0
`findRadianceAndShading` | 8.7 | 57.6 | 6.6
`checkShadows` | 9.4 | 40.5 | 4.3
`addShading` | 0.2 | 1.5 | 7.5
`addImpulses` | 22.1 | 120.9 | 5.5
`updateImage` | 2.6 | 22.4 | 8.6
**Total** | 84.8 | 474.1 | 5.6
[Component runtime on Sponza Mirror at different resolutions. 1280x720 is 6.4 times the pixel count of 480x300.]

On average, the runtime scales linearly with the number of pixels in a given scene. Increasing the pixel count 6.4-fold increases the total runtime by slightly less than that factor. Certain methods perform better as they scale to greater numbers of pixels. For instance, `findIntersections` runs only 5.4 times slower at the higher resolution because its overhead is distributed over more pixels. Other methods, such as `findRadianceAndShading`, did not scale as well, running 6.6 times slower for 6.4 times the pixels.
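The scaling figures in the table can be checked with a short calculation. The sketch below assumes the two resolutions are exactly 480x300 and 1280x720 as given in the table header; `pixelRatio` and `relativePerPixelCost` are hypothetical helper names. A relative per-pixel cost below 1 means the component became cheaper per pixel at the higher resolution.

```cpp
#include <cassert>

// Ratio of pixel counts between a high and a low resolution.
double pixelRatio(int lowW, int lowH, int highW, int highH) {
    assert(lowW > 0 && lowH > 0);
    return (static_cast<double>(highW) * highH) /
           (static_cast<double>(lowW) * lowH);
}

// Time ratio divided by pixel ratio. Below 1: per-pixel cost dropped at the
// higher resolution. Above 1: per-pixel cost grew.
double relativePerPixelCost(double lowResMs, double highResMs, double pixels) {
    assert(lowResMs > 0.0 && pixels > 0.0);
    return (highResMs / lowResMs) / pixels;
}
```

With the table's values, `pixelRatio(480, 300, 1280, 720)` is 6.4. The total (84.8 ms to 474.1 ms) and `findIntersections` (35.2 ms to 190.1 ms) both come out below 1, while `findRadianceAndShading` (8.7 ms to 57.6 ms) comes out slightly above 1.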
Overall, image tracing took 5.6 times longer for 6.4 times as many pixels.

Note that in this scene, `addImpulses` does not take as much time as it would in worst-case scenarios. Relatively few pixels require impulse ray casting, which is the recursive, non-batched component of our algorithm. The number of impulse ray casts at a given depth is exponential with respect to depth in the worst-case scenario (where every impulse ray is further cast into two new impulse rays). In this scene, not only do the pixels with impulses make up a small portion of the screen, but the mirror floor, while casting two impulse rays, triggers no further ray casting, as no other surfaces in the scene have impulses. For a complex scene with high impulse depth, `addImpulses` will take much longer to run relative to the other methods, but will still scale linearly with respect to the pixel count.

Quality Results
=============================================================



The teaser video demonstrates our program's ability to handle multiple lights, multiple impulses, and the skybox while running in "real time." The reflective floor in Sponza Mirror actually has a refraction impulse that allows us to see the floor tiling pattern directly underneath it, which is why the floor texture is still visible on the mirrored surface. Both lights affect the scene at all points where their radiance is non-negligible. Finally, the skybox is visible in the reflection off the floor.



The flythrough of the landscape procedurally generated by Yitong, Kenny, Cole, and Melanie demonstrates our program's ability to handle very large scenes.

Evocative Result
=============================================================



Self-Evaluation
=============================================================

Specification: Our specification is scoped appropriately to the project scale. We sought to refactor our Paths project to perform Whitted Ray Tracing, a task amounting to about half the specification of a full weekly project.
Our specification is unambiguous because it consists of objectively verifiable features that are presented clearly and concisely.

Learning: We learned to coordinate a large project in a group of four programmers. Because of the size of the group, direct person-to-person communication was not always possible, even for the larger changes in our program. Thus, clear communication in the journal through precise documentation of changes and bugs was essential. The reduced specification and longer development time allowed us to get experience polishing a project in order to best show off its features. We were able to invest substantial time into producing demo videos, through which we gained experience in filming a scene to capture its salient features.

Workflow: Our workflow was good. We addressed bugs quickly as they came up and documented them in the journal. We also documented our updates and any problems encountered during implementation. Our workflow could be improved by better scheduling of tasks: group members simply advanced the project from whatever state they found it in. More efficient implementation could be achieved by partitioning related tasks.

Code Evaluation: After multiple refactorings, our code is well-structured and easy to understand.

Report Quality: Our report is clear and easy to understand. We divided our report into the sections given by the meta-specification, and ordered those sections according to logical presentation order. Our report looks polished and professional. Overall, it is as brief as possible while conveying necessary information. Verbose writing occurs occasionally in order to ensure that all relevant concepts are presented clearly, and only when we could not find a more concise formulation.
Schedule
============================================================

Component | Deadline | Personnel
:----------------------------|:---------------:|:----------------
Realtime camera in `App.cpp` | 10/13 | Ryan
Recursive impulse casting | 10/13 | Zander
Updating concurrency | 10/13 | Tuan
Integrating impulse casting | 10/13 | Eli
Finalizing impulse algorithm | 10/19 | Eli, Ryan, Zander, Tuan
Skybox support | 10/19 | Ryan, Tuan
Refactoring | 10/19 | Eli, Ryan, Tuan
Optimization | 10/19 | Eli, Ryan, Zander, Tuan
Time tests | 10/19 | Eli
Final scene selection | 10/19 | Zander
Video creation | 10/19 | Ryan
Report | 10/19 | Zander, Eli
Presentation | 10/20 | Eli, Ryan, Zander, Tuan

Change Log
============================================================

1. Implemented a 2N length array for processing shadow rays. Eli, 2016-10-08
2. Initial implementation of real-time rendering using Morgan's renderer. Ryan, 2016-10-08
3. Decided to implement recursive rays to achieve maximum correctness. Eli + Zander, 2016-10-10
4. Initial implementation of recursive ray-tracing method, 'rayCast'. Zander, 2016-10-10
5. Debugged/corrected to **working rayCast** and rendered first image. Eli + Zander, 2016-10-10
6. **Working real time rendering** with our renderer, **MVP complete**. Ryan, 2016-10-11
7. Refactored direct light tracing to iterate through all light sources. Tuan, 2016-10-12
8. Skybox and ambient light added. Ryan + Zander, 2016-10-12
9. Speed tests, ours and against Morgan's code. Eli, 2016-10-13
10. Skybox shooting and debugging. Tuan, 2016-10-14
11. Large refactoring complete. Eli + Ryan + Tuan, 2016-10-17
12. Fixed bugs with shadow rays and refactoring. Eli + Ryan, 2016-10-18
13. Refactored code for reuse between direct and indirect light tracing, **full code polish complete**. Tuan, 2016-10-18
14. Imported scenes from other groups, report refactored and ready for polish. Zander, 2016-10-19
15. Report complete, project ready for presentation.
Eli + Ryan + Tuan + Zander, 2016-10-20

Skills
============================================================

- Adapting and refactoring code from `PathTracer` to trace direct rays from all sources while approximating ambient lighting and indirect lighting.
- Wrapping concurrent direct ray tracing inside a loop over all lights while tracing indirect rays using mainly loops and recursion (non-concurrent).
- Refactoring code in both direct ray and indirect ray tracing to achieve maximum code reuse.
- Other techniques for speeding up Whitted Ray Tracing: not tracing rays with low radiance, not tracing non-existent impulses, etc.
- Group workflow, using SVN and the journal effectively to track and communicate changes.

Acknowledgements
========================================

Thank you to Kenny, Yitong, Melanie, and Cole for the procedurally-generated landscape displayed in the quality and evocative results sections. All scenes are variations of default scene files in G3D.

**Other models used in landscape scene:**

razrushite134 [Shakirov Rinat] (2015) "Low Poly Sheep" [3D model]. cgtrader.com https://www.cgtrader.com/3d-models/animals/other/low-poly-sheep Accessed 10/21/2016.

Simon Telezhkin (2016) "Low poly tree sample" [3D model]. turbosquid.com http://www.turbosquid.com/3d-models/sample-trees-c4d-free/1008420 Accessed 10/21/2016.

PaulsenDesign (2015) "Cartoon low poly tree" [3D model]. turbosquid.com http://www.turbosquid.com/3d-models/free-3ds-model-trees-scene/961487 Accessed 10/21/2016.

ProArch3D (2016) "Low poly clouds 3d model" [3D model]. proarch3d.com http://www.proarch3d.com/low-poly-clouds-3d-model/ Accessed 10/21/2016.

**Music in teaser video:**

ALBIS. _Juicy_. YouTube Royalty Free Music, 2014. https://www.youtube.com/watch?v=wo2PX-bctwE

**Music in evocative video:**

Gunnar Olsen. _Never Sleep_. YouTube Royalty Free Music, 2014.
https://www.youtube.com/watch?v=CxevdmHT3O0

Bibliography
=======================================

[#Whitted1980]: Turner Whitted, An improved illumination model for shaded display, pages 343-349, Communications of the ACM, June 1980