This tutorial shows how a procedural can create as much geometry as possible in a fixed amount of RAM.
Procedurally generated geometry for rendering in Arnold goes through three phases of RAM consumption. The first phase is the generation of the data, the second is the filling of Arnold's data structures, and the last is the creation of the ray acceleration structure, the Bounding Volume Hierarchy (BVH), which is usually the largest of the three. When all of this is done, the actual rendering begins, which does not use much more RAM. Building arrays of data directly into Arnold's data structures is the most efficient method, when it is possible. Users do not have control over the RAM consumption of the BVH.
In our example, we are generating a "Mandelbulb", a 3D version of the Mandelbrot and Julia sets. The algorithm iterates a function (Z^n+C) and checks whether the result leaves a sphere of radius 2; if it stays inside for a set number of steps, the point is considered a "prisoner point" and a small sphere is placed there. Rendering a Mandelbulb as a dense grid of spheres is neither the most elegant nor the most efficient way to display this mathematical entity; we use it only as a way to create large amounts of data for the purposes of this tutorial.
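The prisoner-point test described above can be sketched in a few lines. This is a minimal sketch, not the tutorial's own code: the function name `is_prisoner_point` and its parameters are hypothetical, and it assumes the commonly used spherical-coordinate definition of the 3D power (raise the radius to the n-th power, multiply both angles by n), which the text above does not spell out.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: returns true if C = (cx, cy, cz) stays inside the
// radius-2 sphere for max_iter iterations of Z -> Z^n + C, starting at Z = 0.
// Assumption: Z^n uses the common spherical-coordinate "triplex" power.
bool is_prisoner_point(double cx, double cy, double cz, int power, int max_iter)
{
    assert(power >= 2 && max_iter > 0);
    double x = 0.0, y = 0.0, z = 0.0;
    for (int i = 0; i < max_iter; ++i) {
        // Convert Z to spherical coordinates.
        const double rxy = std::sqrt(x * x + y * y);
        const double r = std::sqrt(x * x + y * y + z * z);
        const double theta = std::atan2(rxy, z);   // angle from the Z axis
        const double phi = std::atan2(y, x);       // angle around the Z axis
        // Raise to the n-th power, convert back to Cartesian, and add C.
        const double rn = std::pow(r, power);
        x = rn * std::sin(power * theta) * std::cos(power * phi) + cx;
        y = rn * std::sin(power * theta) * std::sin(power * phi) + cy;
        z = rn * std::cos(power * theta) + cz;
        // Escaped the sphere of radius 2 (compare squared lengths).
        if (x * x + y * y + z * z > 4.0) return false;
    }
    return true;   // never escaped: treat as a prisoner point
}
```

A procedural would call a test like this once per grid point and emit a sphere for every point that returns true, which is why the final sphere count cannot be known in advance.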
This animation shows the sphere size in a close-up of a Mandelbulb made from approximately a quarter of a billion spheres. It was rendered on a laptop computer with 8 GB of RAM:
It is not possible to know in advance how many spheres we will end up with, so we cannot fill Arnold's arrays directly; instead, we fill a linked list and then use it to fill the arrays in a second pass. If we filled RAM with one giant linked list of all the spheres, we would then need to allocate an array for Arnold and copy the data into it. Because the list and the array would have to exist at the same time, only about half of the system's RAM could hold renderable geometry. Instead, in our example, we break the task into smaller chunks, and for each chunk we fill a linked list, allocate an array, copy the data, and delete the linked list. This allows us to fill nearly all of our RAM with renderable geometry. For our example, we simply broke the Mandelbulb up into slabs along the X axis.
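The chunked two-pass scheme can be illustrated without any Arnold calls. The sketch below is a stand-in under stated assumptions, not the procedural's actual code: `Sphere`, `Node`, `build_slab`, and `build_all_slabs` are hypothetical names, a unit-ball test replaces the Mandelbulb iteration, and a `std::vector` plays the role of the array that would be handed to Arnold.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Sphere { float x, y, z; };
struct Node { Sphere sphere; Node* next; };

// Hypothetical stand-in for the real membership test (a unit ball here).
static bool inside_shape(float x, float y, float z)
{
    return x * x + y * y + z * z <= 1.0f;
}

// Builds one X slab: grid columns [ix_begin, ix_end) of a res^3 grid
// spanning [-1, 1]^3.
std::vector<Sphere> build_slab(int ix_begin, int ix_end, int res)
{
    assert(res > 1 && 0 <= ix_begin && ix_begin <= ix_end && ix_end <= res);
    const float step = 2.0f / static_cast<float>(res - 1);
    Node* head = nullptr;
    std::size_t count = 0;

    // Pass 1: the hit count is unknown up front, so push hits onto a list.
    for (int ix = ix_begin; ix < ix_end; ++ix)
        for (int iy = 0; iy < res; ++iy)
            for (int iz = 0; iz < res; ++iz) {
                const float x = -1.0f + ix * step;
                const float y = -1.0f + iy * step;
                const float z = -1.0f + iz * step;
                if (inside_shape(x, y, z)) {
                    head = new Node{{x, y, z}, head};
                    ++count;
                }
            }

    // Pass 2: allocate exactly `count` entries, then copy the data over and
    // free each list node as soon as it has been copied.
    std::vector<Sphere> slab;
    slab.reserve(count);
    while (head != nullptr) {
        slab.push_back(head->sphere);
        Node* next = head->next;
        delete head;
        head = next;
    }
    return slab;
}

// Walks the grid slab by slab. Each slab's temporary list is gone before the
// next slab starts, so the extra memory is one slab's list, not the full set.
std::vector<std::vector<Sphere>> build_all_slabs(int res, int slab_width)
{
    assert(slab_width > 0);
    std::vector<std::vector<Sphere>> slabs;
    for (int ix = 0; ix < res; ix += slab_width) {
        const int ix_end = (ix + slab_width < res) ? ix + slab_width : res;
        slabs.push_back(build_slab(ix, ix_end, res));
    }
    return slabs;
}
```

The slab width is the tuning knob in this sketch: thinner slabs mean a smaller temporary list at any one time, at the cost of more, smaller arrays.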
Running the math that generates the points takes CPU power, and many modern systems have multiple CPUs. In order to fill the RAM as quickly as possible, we take each of our chunks and break it into sub-chunks, so that each CPU fills one sub-chunk in parallel with the others, using threads created with Arnold's AiThreadCreate(). There are numerous other accelerations that could be incorporated, such as SIMD sin/cos functions or offloading computations to a GPU, but we leave these types of optimizations out of our example.
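The sub-chunk idea can be sketched as follows. To keep the example runnable outside Arnold, it uses `std::thread` from the C++ standard library in place of AiThreadCreate(), and per-thread `std::vector` buffers in place of linked lists; the names `Point`, `is_hit`, `fill_sub_chunk`, and `build_slab_parallel` are hypothetical, and the unit-ball test again stands in for the Mandelbulb iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

struct Point { float x, y, z; };

// Hypothetical stand-in for the per-point Mandelbulb test (a unit ball here).
static bool is_hit(float x, float y, float z)
{
    return x * x + y * y + z * z <= 1.0f;
}

// Worker: scans rows [iy_begin, iy_end) of the slab [ix_begin, ix_end) and
// appends hits to its own private buffer, so no locking is needed.
static void fill_sub_chunk(int ix_begin, int ix_end, int iy_begin, int iy_end,
                           int res, std::vector<Point>* out)
{
    const float step = 2.0f / static_cast<float>(res - 1);
    for (int ix = ix_begin; ix < ix_end; ++ix)
        for (int iy = iy_begin; iy < iy_end; ++iy)
            for (int iz = 0; iz < res; ++iz) {
                const float x = -1.0f + ix * step;
                const float y = -1.0f + iy * step;
                const float z = -1.0f + iz * step;
                if (is_hit(x, y, z)) out->push_back(Point{x, y, z});
            }
}

// Splits one X slab into num_threads sub-chunks along Y, fills them in
// parallel, then merges the results into a single exactly sized array.
std::vector<Point> build_slab_parallel(int ix_begin, int ix_end, int res,
                                       int num_threads)
{
    assert(res > 1 && num_threads > 0);
    std::vector<std::vector<Point>> partial(static_cast<std::size_t>(num_threads));
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        const int iy_begin = res * t / num_threads;
        const int iy_end = res * (t + 1) / num_threads;
        workers.emplace_back(fill_sub_chunk, ix_begin, ix_end,
                             iy_begin, iy_end, res, &partial[t]);
    }
    for (std::thread& w : workers) w.join();

    std::size_t total = 0;
    for (const auto& p : partial) total += p.size();
    std::vector<Point> slab;
    slab.reserve(total);
    for (auto& p : partial) {
        slab.insert(slab.end(), p.begin(), p.end());
        std::vector<Point>().swap(p);   // release this buffer right away
    }
    return slab;
}
```

Giving every worker its own output buffer avoids any synchronization while the points are generated; the only serial step is the final merge, which is cheap compared with the per-point math.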
Below is the source code for the procedural:
The following .ass file generates an image similar to the one above:
Here is a high-resolution, 2.5-billion-sphere render by Thiago Ize, along with some close-ups rendered by Lee Griggs.