A Ray Tracer - Part 2

Tags: C, Graphics, Mathematics, Programming, Ray Tracing

Last time we build up the basics of the raytracer: some essential math and the corresponding function implementations and structures in the C programming language. We will expand the ray tracer this time to generate actual graphics and to generate these images with multiple spheres, light sources, and reflections.

Painter

If you compiled and ran the example attached to the last article, you should have noticed a console output which looks like this:

Raytrace ASCII

Not super exciting, I know, but it verifies the correctness of our ray/sphere intersect function. Let's take this one step further and create an actuall graphic which you can open in an image viewer.

Computer images are built up out of pixels (picture elements). Each pixel can represent any colour made up from the three primary colours red, green and blue. When we say that an image is '24-bit colour', we mean that each pixel is built up out of 3 values, each 8 bit long. In this way, we can let the red, green and blue component vary in value between 0 and 255 because $ 2^8=256 $. With 24 bit colour, we have enough to represent over 16.7 million colours.

Let's start with something simple. Suppose we want to programatically create the Finnish flag:

Finnish Flag

The first thing we need to do is agree on the format we're going to use. There are many image formats such as jpeg, png, gif, etc. Most of these are quite complex to generate unless you use existing libraries. For this tutorial however, we will use something very simple: the ppm (portable pixmap) format. The ppm format is a simple uncompressed format from the Netpbm family of image formats. The PPM format has the following components (also called the header) needed for the file to be identified and displayed correctly by the image viewer:

  • A "magic number" identifying the file type, in our case this will be "P6"
  • Whitespace (any TAB/SPACE/NEWLINE)
  • Width of the image, ASCII decimal representation
  • Whitespace (any TAB/SPACE/NEWLINE)
  • Height of the image, ASCII decimal representation
  • Whitespace (any TAB/SPACE/NEWLINE)
  • Maximum colour value, ASCII decimal representation
  • Whitespace (single character, usually newline)
  • Actual image data

Everything up until the image data can be written in C as follows:

fprintf(f, "P6 %d %d %d\n", width, height, 255);

Where 'f' is the pointer to an opened and writeable file. We will get back to this later. 'P6' in the PPM format indicates 'raw' PPM format. You can read more about the details in the links mentioned above.

So what is the image data composed of? It simply is the colour information per pixel, for all the pixels in the image. In other words it is the data represented by 3 bytes per pixel (8 bit per colour, 24 bit total) multiplied by the width and the height of the image. Therefor, a 24 bit image, measuring 800 pixels wide with a height of 600 pixels will contain 3 x 800 x 600 bytes of data, making 1.44MB of data. Interesting note: this is the amount of data you could store on a standard PC formatted double sided high density 3 1/2 inch floppy disk.

To open a PPM file, you need to install an appropriate viewer since Microsoft Windows does not come with PPM support by default. Linux should be able to open it out of the box. For Windows (and Mac) users, I've found these programs that should be fien for displaying PPM images:

For Windows: http://www.irfanview.com/

For Mac: http://www7a.biglobe.ne.jp/~ogihara/software/OSX/toyv-eng.html

To write a PPM file, we can write a function in C which looks like this:

/* Output data as PPM file */
void saveppm(char *filename, unsigned char *img, int width, int height){
   
    /* FILE pointer */
    FILE *f;
   
    /* Open file for writing */
    f = fopen(filename, "w");
   
    /* PPM header info, including the size of the image */
    fprintf(f, "P6 %d %d %d\n", width, height, 255);
   
    /* Write the image data to the file - remember 3 byte per pixel */
    fwrite(img, 3, width*height, f);
   
    /* Make sure you close the file */
    fclose(f);

}

The pointer *img points to our image data, which consists of 3*width*height bytes of data, representing each pixel with its three colour components. Since each colour component can vary between 0 and 255, we can easly create the two colours needed for our flag. White can be represented with each component fully on, that is r=255, g=255, and b = 255. Similarly, the colour blue could be r=0, g=0, and b=128. This will give one shade of blue; you could use others by varying the value of each component.

The image data itself we can store in an array. Later on we can use dynamic memory allocation, but for now, a statically allocated array will do fine. Inside this array, we have to set the correct pixels white and blue in order to reporesent the Finnish flag. This is a matter of pretending you're in front of a canvas and drawing certain sections in a certain colour.

In C, we could do this as follows:

#define WIDTH  800
#define HEIGHT 500

int main(int argc, char *argv[]){

        unsigned char img[WIDTH * HEIGHT * 3];
        int i,j;

        for(i=0; i<HEIGHT; i++){
                for(j=0; j<WIDTH; j++){
                        if( (j>=250 && j<350) || (i>=200 && i<300) ){
                                img[(j + i*WIDTH)*3 + 0] = 0;
                                img[(j + i*WIDTH)*3 + 1] = 0;
                                img[(j + i*WIDTH)*3 + 2] = 128;
                        }else{
                                img[(j + i*WIDTH)*3 + 0] = 255;
                                img[(j + i*WIDTH)*3 + 1] = 255;
                                img[(j + i*WIDTH)*3 + 2] = 255;
                        }
                }

        }

        saveppm("image.ppm", img, WIDTH, HEIGHT);

return 0;
}

Within each iteration of the inner for loop, we draw one pixel. After one complete cycle of the outer for loop, one horizontal line is fully drawn. Within the inner for loop, we decide the colour each pixel will have, and set it's triplet (r,g,b) to the correct values. Whether a pixel becomes blue or wite is defined by the if statement in the inner for loop. It just checks if we are in the area of the flag that needs to be blue or white, calculated in advance knowing the width and height of the image. At the end, the image is written to a file, and you should be able to open it with an image viewer. The full version of this program is attached to this post (flag1.c).

Back to our ray tracer. We can now go back to the program we made last time and, instead of generating an ASCII output with a couple of printf() statements we can colour our sphere in any colour we want. We can also generate an image that is much larger than the 40x40 'image' we created before. As an example, let's build an 800x600 image with a red sphere on a black background. The resulting program is also attached (raytrace_sphere_1.c).

Hardly realistic, but it provides a start to make things look much better. The reason why it doesn't look realistic is because we're missing probably the most important element of a scene: "Lights, Camera, Action". We have the camera and the action (the sphere) - now all we need is light! First thing we want to do is change the way colour is assigned. After all, colour is a property of the object in question and should hence be assigned as such. We can change our code a little so that we have an element called 'colour' which contains the three colour components of the final shade. We can easily do this again with a structure:

/* Colour definition */
typedef struct{
        float red, green, blue;
}colour;

Why floats? We're going to need a way to represent the colour as a precentage. While one could calculate in absolute values of 0 to 255, it is easier to be able to say "colour this object with 50% red, 30% green and 10% blue". Floating points allow us to do that easily by specifying a value between 0.0 and 1.0, which represent the percentages.

As we said before, we need lights. Lights within this context have two things: a position in 3D space and an intensity per colour component. This we can represent as follows:

/* Light definition */
typedef struct{
        vector pos;
        colour intensity;
}light;

We want to be able to create a couple different materials and assign these to the sphere. A material in this context will be defines as an entity that can be assigned to an object, defining its reflectivity (0% - 100%) and diffuse colour. Diffuse colour will indicate the colour the object will reflect when being lit. Reflectivity will be used to determine how 'shiny' an object is and will act as a mirror for other objects. A material can be constructed as:

/* Material definition */
typedef struct{
    colour diffuse;
    float reflection;
}material;

Similarly, our sphere object can now become:

/* Sphere Primitive definition */
typedef struct{
        vector pos;
        float radius;
        int material;
}sphere;

Here, material is the id which will be used to select the assigned material from an array of materials. The idea now, simplified, is to shoot our ray into the scene, and should it hit the sphere, bounce it off of the sphere (reflect) and check if it hits the lightsource. If it does, this point on the sphere is lit, otherwise it is not.

We now have to develop a way to use these newly defined entities. Let's start with building up a scene. We have several spheres we want to show (instead of just the one from the previous example). We also want to position some lightsources, since one lightsource will not be sufficient to provide realistic lighting. For now we will go through the process of positioning them one by one, but later on it would be easier to use a text file describing the scene to set this up. We will start with three materials, three lightsources and three spheres, all in their proper arrays. I'm not going to paste the code for that here since it's quite long and repetitive, but you can see the code in question, together with the next couple of topics in raytrace_sphere_1.c attached to this post.

The next thing we should do is to revise our ray-sphere intersection function. The way it is currently written it will only return true or false indicating intersection or not. We should however be able to find out exactly where this point of intersection is. Moreover, in case there are two intersection points, we want to know the one closest to the screen. That is the one that we actually see - the other one is on the other side of the sphere, and not visible. To accomplish this, we can modify the function as follows:

/* Check if the ray and sphere intersect */
bool intersectRaySphere(ray *r, sphere *s, float *t){
       
        bool retval = false;

        /* A = d.d, the vector dot product of the direction */
        float A = vectorDot(&r->dir, &r->dir);
       
        /* We need a vector representing the distance between the start of
         * the ray and the position of the circle.
         * This is the term (p0 - c)
         */
        vector dist = vectorSub(&r->start, &s->pos);
       
        /* 2d.(p0 - c) */ 
        float B = 2 * vectorDot(&r->dir, &dist);
       
        /* (p0 - c).(p0 - c) - r^2 */
        float C = vectorDot(&dist, &dist) - (s->radius * s->radius);
       
        /* Solving the discriminant */
        float discr = B * B - 4 * A * C;
       
        /* If the discriminant is negative, there are no real roots.
         * Return false in that case as the ray misses the sphere.
         * Return true in all other cases (can be one or two intersections)
         *  t, the solution to the equation represents the closest point
         * where our ray intersects with the sphere.
         */
        if(discr < 0)
                retval = false;
        else{
                float sqrtdiscr = sqrtf(discr);
                float t0 = (-B + sqrtdiscr)/(2);
                float t1 = (-B - sqrtdiscr)/(2);
               
                /* We want the closest one */
                if(t0 > t1)
                        t0 = t1;

                /* Verify t0 larger than 0 and less than the original t */
                if((t0 > 0.001f) && (t0 < *t)){
                        *t = t0;
                        retval = true;
                }else
                        retval = false;
        }

return retval;
}

 The value 't' represents the length of the closest intersection. We can illustrate this as follows:

Revised intersection calculation

The value 't' is a scalar, and in order to find the vector $ \vec{P} $, we have to write a function which takes this scalar and multiplies it with our vector $ \vec{R} $ which is the direction of our ray. Multiplication of a vector with a scalar is just multiplying each component with the scalar. We know this already from part 1, in which we discussed parameterizing. Since the direction of our ray is always $ (0,0,1) $, the scalar multiplication will result in a scaled vector representing the distance from the start of our ray to the point of impact on the sphere. Adding this scaled vector to the start position of our ray will give the vector $ \vec{P} $. One thing I'd like to note here is the check whether t greater than 0, which is written as t0 > 0.001f in the code. The reason for using 0.001 instead of 0 has to do with floating point comparison and precision when representing floating point numbers. You can read more about this here.

We will need two additional functions in our arsenal to deal with the above: scaling and adding vectors:

/* Calculate Vector x Scalar and return resulting Vector*/
vector vectorScale(float c, vector *v){
        vector result = {v->x * c, v->y * c, v->z * c };
        return result;
}

/* Add two vectors and return the resulting vector */
vector vectorAdd(vector *v1, vector *v2){
        vector result = {v1->x + v2->x, v1->y + v2->y, v1->z + v2->z };
        return result;
}

Now that we have materials, spheres and lights, it's time to start working on the main objective. We need to simulate how light interacts with the spheres and the camera to build a realistic image. Each point on each sphere can be lit or unlit. The points that are lit can be lit with different intensities depending on how much light reaches the point in question. With our ray which we shoot through the screen will need to determine if it hits a point on one of the spheres, if light reaches this point. Furthermore, light reflects of off objects so from the impact point we reflect the ray and perhaps hit another sphere.

Our algorithm will look, simplified, like this:

for(HEIGHT){
  for(WIDTH){
      do for every ray:
         - Find closest ray/sphere intersection:
           * Iterate over every sphere

         - Check if we reach a lightsource from this point
           * Iterate over every lightsource
           * Find right colour

         - Either go with reflected ray or go to next pixel
  }
}

To get the colour and intensity of each pixel, we will apply Lambertian reflectance to model diffuse reflection whereby the light will be reflected in equally in all directions from the point where the light hits the object. This is a very simple model where: $ I_D = \vec{L} . \vec{N}CI_L $. Here, $ \vec{N} $ is the normal at the point of intersection, and $ \vec{L} $ denotes the normalized vector from the point of intersecton to the light source. $ C $ represents the colour of the sphere and $ I_L $ is the intensity of the incoming light. The resulting $ I_D $ scalar represents the surface brightness at that point.

We will program this later as follows:

- For each lightsource that can be reached:
     * Calculate Lambert dot product with material reflection
     * Calculate each colour component, consisting of:
- Lambert dot product result
- Per colour intensity of the incoming light

We then need to get the reflection of our incoming ray and see if we hit another sphere and repeat the process. Getting the reflected ray is simple according to the law of reflection :

Reflection 1

The angle between the incoming ray and the normal is equal to the angle between the reflected ray and the normal ($ \theta_r = \theta_i $). The normal on a point on a sphere is a vector which can be constructed by drawing it starting at the center of the sphere and going through our point of intersection. Mathematically, we can determine this when we present the situation as in this drawing:

Reflection 2

When we assume that $ \vec{P} $ and $ \vec{Q} $ are normalised, we know from basic trigonometry that:

$ |\vec{Q_1}| = cos(\theta_r) = cos(\theta_i) = |\vec{P_1}| $

$ |\vec{Q_2}| = sin(\theta_r) = sin(\theta_i) = |\vec{P_2}| $

From the drawing above, we can also see that:

$ \vec{Q_1} = -\vec{P_1} $ $ \vec{Q_1} = \vec{Q_1} $

After summation, we can get the direction:

$ \vec{Q} = \vec{Q_2} + \vec{Q_1} = \vec{P_2} - \vec{P_1} $

We can work this further out. We know that: $ \vec{Q} = \vec{P_2} - \vec{P_1} $.

Using orthogonal projection:

Projection

We know that $ \vec{proj_U(V)} = a\vec{U} $ for some scalar a. Since $ \vec{proj_U(V)} - \vec{V} $ is orthogonal to $ \vec{U} $, we have that $ (a\vec{U} - \vec{V}).\vec{U} = 0 $.

Multiply out the dot product and solving for a gives:

$ a\vec{U}.\vec{U} - \vec{V}.\vec{U} =  0 $

$ a = \vec{V}.\vec{U}/\vec{U}.\vec{U} $.

The result is that:

$ \vec{proj_U(V)} = (\frac{\vec{U}.\vec{V}}{\vec{U}.\vec{U}})\vec{U} $

From the definition of the dot product, we know that $ \vec{U}.\vec{U} = |\vec{U}|^2  $.  

Applying this to our situation gives that, with $ \vec{N} $ our Normal:

$ \vec{P1} = \frac{\vec{P}.\vec{N}}{|\vec{N}|^2}\vec{N} $

Knowing that $ |\vec{N}| = 1 $

This gives $ \vec{P1} = (\vec{P}.\vec{N})\vec{N} $

Coming back to: $ \vec{Q} = \vec{P_2} - \vec{P_1} $, we can solve this as follows:

$ \vec{Q} = [\vec{P} - (\vec{P}.\vec{N})\vec{N}] - (\vec{P}.\vec{N})\vec{N}) $

This can be solved further as:

$ \vec{Q} = \vec{P} - 2(\vec{P}.\vec{N})\vec{N}) $

That's it - we now have our reflected vector and continue to the next ray trace loop with this vector as a start.

The full C code can be found in the raytrace_sphere_2.c file attached. The main raytrace loop will continue to iterate over multiple reflections until a certain level (otherwise it might run forever). We can further enhance this program by adding shadows to the final image. Calculating shadows is pretty easy: one just has to trace the 'lightRay' found before determining the colour back to each object. If it intersects with another sphere, the point in question sits in the shadow of said sphere. The solution to this is in the attached raytrace_sphere_3.c.

Attachments

Comments

just a tiny bit more help needed

Hello !

First of all thank you for this awesome tutorial , it is greatly helping me in understanding how computer graphic works, and i am currently trying to use it to make my own raytracer. I cannot thank you enought for the progress i've made so far thanks to that tutorial.

However, as a pretty below average user in both math and programming , there is still a lot of things i fail to understand, i was wondering if by any chance you could provide and exemple code for just one other shape adapted to that tutorial ( i have tried looking at the github link you provided in the comments but it is far too advanced for me) just so i can compare to the difference with treating a sphere and ... something else ^^.

it's been a verry long time so i'm not sure if you still follow this, but i guess it's worth a try.

Thanks again !

RE: just a tiny bit more help needed

Hi,

Thanks - I'm glad you liked the tutorial!

If you look at the collideRayTriangle() function in the Github repo here: https://github.com/PurpleAlien/Raytracer/blob/master/triangle.c that should give you an idea. The reason for using a collideRayTriangle() is that 3D objects are often made up of polygons (triangles in principle), and thus we can right away use much more complex 3D objects when we can use those. This is the 3D spaceship you get when you run that code.

You can also find other intersections between a ray and primitive. for example a cone and a ray. For that, let me point you to this document:
https://www.geometrictools.com/Documentation/IntersectionLineCone.pdf

There are other possibilities of course, but a sphere and a cone are probably the easiest when it comes to primitives. Once you understand the ray/triangle intersection, you can visualize complex 3D objects like the one in my Github code. Naturally, none of this is really optimized for best performance - but there are lots of tutorials, books and so on written on that topic. I'm sure you'll come across those as you proceed.

Hope that helps!

 

Johan.

it works !

Thank you verry much ! i finally got it to work with triangles , cylindres and cones , thank to the code you provided above !

 

Thanks a lot for this great tutorial !

Re: it works !

Glad to hear that - have fun! :)

Johan.

thanks

Thanks for this nice tutorial. Now my goal is to be able to move the observation point. A friend told me that I have to use another projection type than orthogonal projection.
Gonna look into it.
I'll also try to get acquainted with SDL in order to make the scene animated. I was able to do it just in the terminal but since later we made ppm I was no longer able to animate the scene.

Re: thanks

Hi.

Glad you liked the tutorial!. You can find the final, expanded version of the raytracer here: https://github.com/PurpleAlien/Raytracer. This one also adds conic projection, so it can give you an idea on how to proceed with different projection methods. 

Johan.

It has been a year...

..now but I got back into this work and got the spheres moving using SDL, in a standalone window, kind of my first 3D animation.
The link to the video of it is here: https://vid.me/nTFS

SDL is really nice to do graphics for the first time. Here I am not using OpenGL with SDL, but it is something you can do (and should do). The spheres are supposed to move a lot faster and I was able to determine that the bottleneck is not SDL, so it must be the computations behind the display of the spheres. This is interesting so I will be looking into how to make it lighter or how to draw other things.

But so far, great success, thanks for your awesome tutorial.

It has been a year...

Nice! So glad to see this tutorial is still useful for people. 

Your speed issues are most definitely because of the ray tracing code. Ray tracing is very compute intensive, and there is still a lot of research in the field of 'real time ray tracing' (Google is your friend). The code in this tutorial was never optimized since I wanted to keep the conversion from the math to the code as straight forward as possible (to support a few courses I taught). You can definitely improve things and speed things up a lot!

Keep having fun!

Johan.