Silsna: Random Design Notes

1. Introduction

This is an RFC, so nothing here is final. We can draw up proper designs based on what everyone else thinks about this and your own ideas. I've written this with Lout, since (La)TeX aren't working on my system right now (don't know why). I should warn you that this document is very random indeed. Don't ask me why section 4.1 has a broken number in the PS version.

2. Coding Practices

It might be a good idea to use a literate programming tool for the source, since that would allow us to generate readable documentation, which would explain the algorithms used. A lot of maths might be involved in the code, so some LaTeX output would be nice.

I'd like to look at building a new version of QefDoc (suggestions for a different name are welcome) in Caml, just to get the hang of it before trying to write the parser in it (see next section). It might also be useful to replace the syntax of the documentation chunks in QefDoc with XML of some form, but I need to know more about what we can do with it first (e.g., can we use XML Style Sheets (XSL), or maybe even SGML ones (DSSSSSSSL) to turn XML into other formats, principally of course HTML for a website and LaTeX). My timid attempts at using an XML parser from Perl have been partially sucessful, but it would be a ball ache to extend that method without using some more sophisticated libraries or programs, which I don't want to write myself.

I think it would be sensible, as POV does, to use a typedef for all the doubles which are used in vectors, transformation matrices and such. Maybe we could use a different typedef for color intensity values, since it might be acceptable for those to be floats instead of doubles, and we could then experiment to see if using floats was good enough and faster.

3. Ray Tracing

A scene consists of a load of objects of various types. Each has standard properties like surface texture and internal material, and might also have properties specific to their type, like radius or height. In the current version of Silsna these objects are just stored in a list which is searched through completely whenever we look for an intersection (I'm not talking about shadow rays though), but we could store them in a tree like structure instead, such as a BSP tree or octree, so that we could avoid testing for intersections with as many as possible of them.

The basic rendering algorithm is: for each pixel trace a ray from the eye position through the center of that pixel's rectangle on the viewing plane and find the first thing it intersects with in the scene. We trest for intersections with an object by transforming the ray from the world space to the model space for that object. For perspective projection the transformed eye coordinate will always be the same for a particular camera and for a particular object, so we can cache it in the object rather than calculate it each time; for orthographic projection the same is true of the direction of the ray. If we find an intersection in the model space then we need to do a transformation back to world space of the intersection point. After all this we need to know whether the ray has intersected anything, and if so where it intersects and how far away the intersection point is from the eye (to calculate attenuation). We can then calculate the color of the surface of the object, and other properties, at that point (by evaluating its texture tree) and use a shading model to work out what color to return along the ray. That depends also on which lights affect the point, which is all of them which aren't shadowed. We determine whether a point is shadowed from a light by looking to see if a ray from the intersection point in the direction of the light intersects anything nearer than the light. We also need to bring into the calculation the contribution from reflection and refraction rays, and ambient light.

If an object has some transparency or translucency then we need to trace a ray through it, the exact direction of which will depend on the normal of the surface, the angle of intersection and the refractive index of the material. The color returned by the refraction ray will depend on the material of the object, which might indicate a solid color throughout (which is easy to calculate if we know how far the ray travels through the material) or it might be a 3D texture or something like a halo (in POV), which needs to be approximated by calculating its color and whatever at regular points along the refraction ray and working out their combined affect of them on the ray. We might want to be able to do adaptive supersampling on those points to get finer results.

TODO: specifying objects, multiple renders of same or similar scene...

If we want to render several pictures of the same scene with the objects slightly adjusted (e.g., for a simple way to do animation without needing any external programs) then we would need to either add the objects to the scene once and then go back and alter them for each frame, or add them afresh in the right state for each frame. Either way requires some access to the scene data which has already been specified from within Sild programs. One idea to do this (for the second method) would be: add the objects which are constant through all frames, push the number of objects in the scene onto some stack (maybe with a push_world directive of some sort) and then add the frame-specific objects and render the scene for the first frame. For the next frame we would unroll the scene graph with something like pop_world and loop.

To add light sources to the scene we would have directives like light and spot_light, which would take parameters like color and position, with suitable defaults. To render the scene we have a camera directive, which we can give a filename to (or omit if we just want to keep the image in memory) and optionally things like position and field of view. There would also be camera_ortho or somesuch, for orthographic projection. The camera directives would act partly as declarations of the parameters of the camera and partly as a Sild function call (as syntactic sugar) which starts the rendering, and they would perhaps return the image as a value so we could tinker with it afterwards.

TODO: cone tracing, radiosity, area light sources...

TODO: object types...

If we have a ‘text’ object, which would be nice, then we need some way of putting a font in. We might make it possible for the input file to specify a list of objects to use as glyphs, so that we could cruft together a font using normal objects. We could wrap some Sild functions round this to allow import of bitmapped fonts by making them from texture-mapped rectangles. Ultimately it would be nice if we could turn a vector (PS or TrueType) font into something we could raytrace, but thats difficult. We might be able to find libraries to help us (e.g., there is a library which can be used to find the curves used in the glyphs in a TTF, which we could turn into a prism or something).

4. Sild

4.1. Overview

Sild is intended to be a programming language which will be, at least partly, built into Silsna. The reasons it was originally planned as part of this project are:

  • To make it easy to build objects algorithmically, e.g., for repetitive structures (see the stairs example below) and fractal shapes, which are not well suited to being built in a GUI modeller.
  • To provide a mechanism for specifying user defined procedural textures, as opposed to built-in ones implemented in C++ (which is what POV provides). These need to be provided to the rendering engine in a form which allows it to evaluate them for any coordinates, so that it can pick the colors needed for the exact intersection points it finds. Good example: one of the two built-in textures in the current version, ‘Wibbly’, is too unlikely to be included in a finished raytracer, but we can make things like that in Sild very easily.

A functional style of language would probably be most appropriate, so we don't need to worry about being able to change objects in-place (except where it would be silly not to, like setting pixels in an image or reading from a file). For most variables we just want assignment to them (I suggest <- as the operator) and finding their values. BTW: I don't think we need user-definable operators, do you?

4.2. Raytracing-Specific Stuff

I'll describe how things work, or are meant to work, in the current version, since I'm mostly happy with it.

To add an object to the scene we give the name of the type of object (call it a constructor), followed by any values which apply to that specific type, followed by the optional parameters. To make it flexible we have sensible defaults for everything, so that a sphere might be created by saying Sphere;, in which case it has a radius of 1 and is centered at the origin. We can specify one or two values before the semicolon to set either the radius (if we give it a single value) or the center position (if we give it a vector, which is just a list of 3 numbers). It doesn't matter which order they are given in.

The ‘optional parameters’ are general to all object types. If they are present they follow a colon after the type-specific values. These things might be transformations (to make the model coordinate system different from the world one) or textures or whatever. They are separated by commas and are combined into one big parameter which the object points to. As transformations are combined we multiply them, and when we've got the whole lot we work out the inverse matrix. Examples of transformations:

Sphere: rotatez PI;              # Rotate round z-axis.
Sphere: rotate PI [1, 2, 3];     # Rotate round arbitrary axis.
Sphere: translate [1, 2, 3];     # Translate by vector.
Sphere: scale 17;                # Uniform scaling.
Sphere: scale [4, 5, 6];         # Uneven scaling.

This, you'll notice, is a lot easier to type than the POV equivalent. I also suggest abbreviations for transformations, e.g., ‘t’ for ‘translate’, ‘rz’ for ‘rotatez’, etc.

Ah! Probably we should make all this functionality available as normal Sild functions too, and make this special syntax compile to normal calls to built-in functions. That would also give us an alternative interface for adding objects and such, which we might make more flexible if there are things we want to add that don't fit into the special syntax but which will rarely be needed.

4.3. Type System

Sild should be able to manipulate variables in a nice easy way. We want, obviously ints/floats, strings (with a little collection of functions to manipulate them, like substr, and a concatenation operator), files (could be a latter addition, since we would have to think about security as mentioned below), lists (and/or tuples) and hashes. Lists and hashes aren't too hard to implement if we use the STL.

For big things like strings their values should be references to the actual thing, and for small things like ints the value can hold everything, like Java. OK, we might find ourselves manipulating large arrays of values (a list of triangles, or a color palette, or whatever) so we want to pass these as references to and from functions, in an imperative style. If we implement lists with something like cons-cells, then we can't efficiently pick out elements by subscript, although we could do things like ‘tail’ very nicely. Should we have both? If we have subscriptable arrays, should they be fixed length tuples? I see no reason not to provide a wrapper around STL vectors (that's what the current implementation does) and provide functions for resizing them, etc. Obviously we can have some convienient list construction/concatenation operators too, like in functional languages.

Just a thought: in the current version I tried to make a string object work just like a list of characters, and a character work just like a single-char string, even though these were all stored in different ways for efficieny. This means we don't need the language to have a seperate way of specifying character literals. To implement this all we need to do is make sure that the operators and built-in functions (or at least the utility functions we use in them, because there might be lots of built-ins) know how to deal with this. I also think it would be nice if, instead of things specific to strings like the substr I suggested above we have generic list/string functions. Its very nice if you can do:

foo <- "fliddle";    # ‘foo’ has type String.
bar <- "z";          # ‘bar’ has type Char, but we can treat it as
                     #   though it were of type String.
head foo;            # The first character of foo.

File IO could be with done a few functions like printf and scanf, starting with just stdin and stdout and posibly adding support for opening other files and manipulating them as values later.

Obviously we want built-in functions to do basic things useful in CG, like trigonometry and matrix manipulation (we could probably just use a list of 16 floats as a matrix value, rather than having a special type for them, but then again it would be more efficient to have a matrix type since we wouldn't have to keep converting between an internal Matrix object and a Sild list, and we could just provide two simple functions to convert between lists and matrices for Sild). As well as bindings for deg_to_rad and rad_to_deg we should also allow a numeric literal to end in the letter ‘d’, to indicate that it is an angle measured in degrees which should automatically be converted to radians. That allows us to make all the trig and transformation stuff use radians without causing an arse-ache for the user. Then we can do things like this:

print (sin (-23d));
Cube: rotatex 45d;

I think it would also be nice to have images as a primitive type, since we'll have a reasonable set of operations defined on them anyway and they could be useful as look-up tables for procedural texture functions or as a means of outputing stuff in a useful way. One nice application: we could write Sild code to post-process the output of the renderer in some way which might be dependent on the camera position and direction (e.g., lens flare) by storing the image produced by ‘camera’ into an image varible and fiddling with it before we write it to a file. The only operations we would need at first on image objects would be reading and writing to files, setting and getting the values of particular pixels, and finding the width and height (and possibly bit depth).

TODO: static vs dynamic typing... The example code in this document assumes dynamic typing because the syntax is easier to guess that way. dynamic typing making easier code... allowing functions which work in different ways, more powerful... printf...

TODO: user defined data types...

4.4. Error Handling

The current implementation has a slightly broken version of exceptions, but in theory it would work. We might not need to catch Sild exceptions in the first version, although it would be useful for testing, so we just raise them and let them filter up to the top level, where they would be caught by the interpretter and printed as an error message. For that to work the exception needs to carry an indication of the error (just a string is good enough) and where it occurred (this isn't done in the current version), which would be source file and line number. If we want a stack trace, which would be very nice for debugging pesky recursive functions, then the exception would need to accumulate this information as it was passed up the call stack. This is starting to look nasty, and if we don't bother with a method to catch the things its rather pointless, so maybe it would be best to do something more like abort(3), i.e., either print a concise error message by default or drop to some sort of debug mode (could be added latter) where the call stack and other state would be available for inspection, or perhaps dump that information into a Sild ‘core’ file which could be inspected by a separate debugger (or by hand if its plain text). Yes, maybe that would be best.

If we go with exceptions then we just need to create an exception object to raise it (since we can't handle them any other way than catching them, just creating an exception value jumps out of the function and returns it). So we might say Exception "balls up" in a Sild function. Probably this would most often be done in built-in functions, though. If we go with the other way of doing things then we could have an ‘error’ or ‘abort’ function, which we give a suitable error message to.

Also, if we don't have exceptions then we have to think for each built-in Sild function whether any errors it could spark should cause the program to abort (should only happen for dire things or things which are easily predictable from the functions args, like division by 0) or make it return some error value.

Here's a little static check we could do, apart from type checking: if the compiler knows whether each built-in function can have side effects then it can also work that out for user-defined functions and for each expression, so we can find statements which can never do anything and print a warning about them, as gcc(1) sometimes does. We could also check for things like code that would never be executed because it is in a while (false) loop or whatever.

4.5. Functions

My current plan for the way to specify arguments to functions is to list them in its definition, and, if we want to take an arbitrary number of arguments, follow them by an ellipsis (...). With no ellipsis the function expects exactly the right number of args, which can be statically checked. If the function definition lists n arguments and has an ellipsis then we expect at least n − 1 arguments, but the function can be called with as many more as we want. The last argument, the one just before the ellipsis, is where all the optional args are put, in a list, which is why n − 1 is acceptable (giving an empty list).

For example, our equivalent of fprintf(3) might be defined as follows (assuming we have support for opening files). The example shows syntax with a simple way of specifying types for the arguments, which in this case would be a good thing, but we can't specify the type of things which would appear in the list of other args (their types would be checked at runtime by the function as it read through the format string, and it would raise an exception or abort if there was a mismatch).

fprintf <- fun File file, String format, List things...:
   # ...whatever, this would be built-in anyway, written in C++.
end;

TODO: closures, ... I don't think we need a separate syntax for named functions, rather than anonymous ones, because we can just assign an anonymous function to a variable to name it. If calling a function requires that it is already defined then we would have to add a special rule that if a function is immediatly assigned to a variable then you can call it from within itself with that name. We would probably want to treat that as a special case anyway to optimize tail recursion.

4.6. Random Note

There is no reason we couldn't write CGI scripts in Sild, although it wouldn't make the best medium for that sort of thing. All we need really is a hash variable to be built at startup containing the enviroment (like %ENV in Perl), which would of course be passed to any programs we spawned if we had facilities like system(3) or popen(3) (but of course we would need to think about the security implications of that, as for our equivalent of fopen(3); the implementation is easy, making sure people don't use it to write virii is not).

4.7. An Illustration

As an interesting example of the possibilities, lets consider a recursive scene showing a painting or whatever on a wall, where the picture in the painting is the scene as a whole, which in turn shows a painting with the scene in, ad infinitum. I've thought of a few different ways of rendering this:

  • A dirty hack: Set the texture of the painting to be the output image which will be produced by the rendering, if it exists, or just grey or whatever if it doesn't. We can then run Silsna to get a picture of a painting which is just solid color, but if we run it again we get a scene with a painting of a scene with a painting which is just a solid color. Run it in a loop with a shell 10 times and it'll look OK. We don't have to worry too much about the bitmap giving us a poor quality texture because it will be smaller than the output image but with as many pixels.
  • A less-dirty hack: Do the same thing, but put the loop in Sild. If we can manipulate images as objects in Sild, and if the ‘camera’ directive can feed its output into a variable instead of a file, then we can do this without making temp-files. We would just have a Sild for loop or whatever building new texture images to feed back into the next render before finally writting one out as the finished scene.
  • An elegant hack: If we can access bits of the rendering engine from Sild, so that we can trace arbitrary rays and find out what color they return, then we could make a procedural texture in the painting which, when Silsna evaluates the color of it at some point, would trace a ray itself in the scene from the camera position, in a direction determined by the texture coordinates. Of course this would happen recursively for some points on the texture, and we would have to keep a count of how many levels we have gone to so that the texture function could return a constant color at a certain depth. It might look something like the code below.
depth <- 0;
camera_pos <- [1, 2, 3];

painting <- fun x, y:
   local color;         # local variable?  whatever...
   depth <- depth + 1;

   if depth = 10:
      color <- GREY;    # give up, we've gone too deep.
   else
      dir <- ... calculate the direction of the ray using (x, y) ...;
      color <- trace_ray (camera_pos, dir);
   end;

   depth <- depth - 1;
   color;               # return the last value, like Caml & Perl.
end;

# ... here we would use ‘painting’ as a texture function.

camera "recursive_hack.png", camera_pos;

4.8. Other Neat Things We Could Add

Here are a few other things which are cool in other languages, which we might want to implement in Sild if we could do so cleanly.

Macros in Lisp (note that I haven't used these, so I might be wrong about all this): a lisp macro is like a function except that it gets passed the expressions it was given as arguments rather than the value returned by evaluating them. They come in the form of trees of cons-cells. The macro can then evaluate or splice about the expression trees as it wants, and build a new expression, its return value, which will be evaluated to give the macro's result. This is great in Lisp because the language's expressions are so easy to represent as simple trees. If Sild is interpreted as a series of expression trees then we could theoretically build some interface for manipulating them, but if its interpretted as byte-code then all bets are off because by the time the macro runs the expression trees have been turned into instructions. I don't think we can use macros.

Currying: I don't really know how this would be done, but it would be one way to create procedural textures which take parameters. We could make a Sild function (or built-in, if we can do currying with those) which takes some parameters which affect the texture followed by the coordinates. We could then make a curried version by applying it to the parameters we want, to give a function which takes only some coordinates, and we could then use this as a procedural texture as described somewhere below.

4.9. Example of Making a Complex Object

As an example of Sild lets make a program to build a flight of stairs, as shown in the diagram below. The stairs are defined by their height (h), width (w) and depth (d), and by the number of stairs to spread over the flight (n). The blob shows the origin, and the stairs are meant to extend back along the negative z axis and up the positive y axis. We will build the stairs as two rectangles for the bottom and back, to polygons for the sides (these are concave, of course, but that shouldn't be too much of a problem in a ray-tracer, although it might confuse an OpenGL previewer) and two rectangles for each stair.

Note that this program relies on the ‘Polygon’ constructor understanding two different formats for its vertices: either seperate vectors or a single list of vectors.

TODO: code...

4.10. Compiling to Byte Code

The current imlementation has the parsing of Sild and interpretting it all mixed up together, which makes it a bit messy and doesn't allow alternative syntaxes, or code created in other ways to be fed into the interpreter. Gareth's solution to this is to make a Sild compiler, which would be a completely separate program to the renderer and which would parse the Sild source code and output some sort of binary object file which can then be executed by the interpreter. We can make this mostly transparent by having the interpreter/renderer run the compiler if it is given a source file. Advantages: obviously can save time if we want to run the same program multiple times (but probably not much); makes it easier to split the work up because we can create test object files by hand and examine the output of the compiler with simple utilities; makes the language design easier to implement, because we can add some nice new syntactic sugar to the language and modify the compiler to understand it, but if the the byte-code format already has everything we need then the compiler will just convert it to something more basic, so we wouldn't need to alter the interpreter; allows us to generate code (object files) from other sources in the future, like a language with completely different syntax (but not semantics).

If the byte-code interpretting is used, together with some of the later ideas in the ‘Error Handling’ section above, then Silsna might start to look like the diagram below. The dotted lines connecting the interpreter to the renderer are meant to suggest that they are closely connected, probably part of the same executable or else one of them (not sure which) in a shared library linked to the other. A latter version of the debugger might also be connected to the interpreter so that expressions can be evaluated and such interactively.

4.11. Scary Stuff

TODO: solving algebra to define parameters of the scene, and specifying objects parametrically. could provide this as a set of function calls, so it wouldn't affect the interpretter, but it would be nice to have syntactic sugar (and if the expressions are parsed in the compiler rather than being passed to some algebra engine as strings then it can check them statically).

5. Reading and Writing Images

I think we should provide support only for PNG and PNM pixmaps for now, since PNG provides all the features we need and supporting PNM allows us to make libpng an optional dependency. It should be possible to remove PNG support at compile time (in which case output would default to PNM). PNM's also are easy to read and write, allowing Silsna to more easily be integrated with whatever little programs a user wants to hack up.

TODO...

6. Texture Mapping

6.1. Pixmaps/Images

TODO

6.2. Procedural Textures

TODO

6.3. Calculating Texture Coordinates

For a 3D texture, the texture coordinates are simply the result of applying an optional 3D transformation to the intersection coordinates. If this transformation stays constant as we transform the object then the texture will stick to the object, which is usually what we want. We could set the texture transformation to the inverse of the modelling transformation to make the object appear to swim through the texture, if it pleased us.

For a 2D texture we choose some algorithm to map the 3D intersection coordinates to 2D texture coordinates. We should be able to specify a 2D transformation matrix to be applied to the texture coordinates after this mapping has been performed, and it might also be useful to allow a 3D transformation beforehand, but we would rarely use them both at once. POV-Ray provides a number of algorithms for mapping 3D to 2D pionts (spherical mapping, cylindrical, and planar mapping (simply removing one of the values of the 3D vector, if we use a plane whose normal is an axis)). We should also supply the method OpenGL uses (and VRML with a textured ‘IndexedFaceSet’), which is to specify for each vertex of a polygon the texture coordinate it maps to and interpolate between them. To give complete flexibility we could even allow a Sild function to be given to do the mapping.

6.4. Combining Textures

If each object has a texture (a pointer, probably, to some object which would either be kept track of in one place or reference counted) then we can mix textures by providing a new class of texture, which would be a list of other textures (or pointers to them). I suppose we could find the mean of their colors as the color of the whole thing, or use the technique in the next section to combine them to varying extents.

Once we have this new texture class an object's texture can be an arbitrary tree of textures, which would be evaluated for each point from the root to all the leaves. Branches which can make no significant difference could be pruned.

6.5. Mapping Floats to Colors, Etc.

TODO

6.6. 2D Vector Graphics

Some textures lend themselves better to being declared as a collection of shapes in a 2D space rather than querying the colors of whatever points we need. For example, a model of a ruler would have lines on it marking the divisions and numbers counting them, and there are many objects which have that sort of surface detail. The POV-style way to solve that is by putting polygons or whatever in 3D space just above the surface of the object, but that sucks. If we could specify a texture made up of 2D shapes we could write a loop to define lines of whatever thickness for the graduations and (eventually, if we have some font support) put the numbers in too. Whether we could implement this without messing up the input language would depend on the final form of the way for specifying 3D objects, so we should keep an eye on it.

We could achieve a 3D equivalent of this with transparent bounding objects intersected with the objects making up the texture.

6.7. Rendering 2D Textures in 2D

For 2D textures it would be nice if we could use a 2D camera directive to render the texture strait to the output bitmap. This would give us an easy and fast way to look at textures as we develop them, and it wouldn't be hard to implement. For 3D textures we could possibly render a slice through them to a bitmap, which might be useful in the same way. This 2D rendering would also be excellent for puting example textures in documentation.

This facility could also be used by an interactive previewer (presumably OpenGL) to put bitmap approximations of fancy textures on objects, if we can work out how to convert the method of turning the intersection coordinates to texture coordinates into OpenGL's system.

6.8. Random Number Generation

Almost forgot, its very important that the random number generation used for turbulence (or by Sild programs accessed from some random number function) can be made to produce the same results for each run. For this we need to implement our own random number function (just steal the algorithm of random(3) from glibc) so that we can guarantee we get the same algorithm on any system, and provide a way of setting the random seed to use from Sild.

7. User Interface

The primary interface to Silsna will be on the command line, obviously. That should be nice and easy to use, once it has been learnt. We should have sensible default values for everything so that you can run silsna without any options or filenames (in which case it would read a scene from the standard input). We have to think about what parameters (e.g., output size, format and filename) can be specified on the command line and which in the Sild input file, and which gets priority if both.

We also want to be able to put the output images in an X window so that we can see what it's doing. This is pretty much OK in the current implementation, although it needs to be tidied up and the X code probably isn't very portable. All this simple GUI stuff should be put in a separate module so that we can add alternatives for Windoze, etc. It should be possible, as it is in the current version, to compile without any of this stuff so that we can compile a smaller Silsna, or compile it on systems without any of the supported windowing systems.

The only things we need for this, I think, are: opening and closing windows, drawing images or pixels into them, waiting for a key press or mouse click in a window, checking (in a non-blocking way) if such an event has occured and setting the window's title. I see no reason why there shouldn't be Sild bindings for these facilities, but there is no need to make new primitive types to represent windows and things. Royce has suggested that we allow the drawing of construction lines or grids to help us see where we are looking in a scene, and we could do that from Sild if we add functionallity to draw a line in a window and to do a projection of a world coordinate to a point in the window.