Working on a 3D graphics “engine” side project
Demo scene capture

How it started

A few months before the Unity runtime fee disaster, I was working on some tools that would help us with our Unity projects. This was supposed to be a long-term project that would have a great impact on our future 3D projects. All that changed after the runtime fee was first announced. Since the tools I was working on were still in their infancy, I had the freedom to look for alternatives.

I tested Unreal Engine over a weekend and a few weekday evenings, and although I liked it quite a bit, it seemed too heavy for our projects. Using Unreal for our project seemed like using a bulldozer to crack an egg.

The next thing I tried was Raylib, which felt pretty fun. I usually work on native mobile apps, where OOP is the name of the game. I tried Raylib in C, which has no OOP, and it was pretty refreshing to do something different for a change. It sort of felt like I was back in high school working on my programming homework. After making a simple 3D scene with an animated character moving through a maze, I tried adding shadows. At this point I realised that even though I knew the theory behind shadow mapping, I couldn’t make it happen. I just didn't have enough knowledge to understand everything that’s going on (how the camera works, how things really get rendered on screen, etc.).

Because I liked Raylib so much, I didn’t bother testing the next engine on the list, which was Godot. Instead, I started building everything from the ground up. The theory was that, armed with all the know-how I would gather along the way, I would finally be able to implement shadows using shadow maps. Spoiler alert: it started as a learning experience and turned into an experiment to see whether we could replace 3rd party engines with our own tech (for our internal 3D projects).

Choices

I wanted to go as low level as I could, in a language that has proven itself over the years. Since this was mainly about learning and expanding my understanding, low level was the way to go.

I wanted to work in C (like Raylib), but I decided to work in C++, mainly for the operator overloading: I wanted to be able to add two vectors by writing v1 + v2 instead of addVector(v1, v2). I wouldn’t be using any other C++ features, so it would be more of a C-style C++. The reason is that I wanted to keep things as simple as possible, with the least amount of overhead (it also makes the generated ASM easier to associate with the original code, and thus to understand what’s really going on in CPU land). I would refrain from using many 3rd party libraries, or much of the standard library for that matter.
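As a rough sketch of what that buys you (the actual math types in the project may differ):

```cpp
struct Vec3 { float x, y, z; };

// The one C++ feature in use: operators instead of addVector()-style functions.
inline Vec3 operator+(Vec3 a, Vec3 b) { return Vec3{ a.x + b.x, a.y + b.y, a.z + b.z }; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return Vec3{ a.x - b.x, a.y - b.y, a.z - b.z }; }
inline Vec3 operator*(Vec3 v, float s) { return Vec3{ v.x * s, v.y * s, v.z * s }; }

// Usage: plain C-style structs, but readable math.
// Vec3 v3 = v1 + v2 * 0.5f;
```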

General structure overview

The main idea is to write a base layer that does most of the low-level stuff, which we can then copy/paste into other projects. Each project will need its own “engine”, and parts of it can be copied into future “engines”. What I’m trying to say is that I’m not writing a new Unity; I’m only writing the minimum feature set that we need. It’s basically a small collection of files with a bunch of functions in them (travelling back in time to the 90s, I guess).

I’m writing it to run on an Apple Silicon MacBook, so I made a basic Mac app project with a ViewController and an OpenGLView. The app gives our base layer all the info it needs: the OpenGL context, window resolution, keyboard input, mouse cursor position etc. By doing it this way, the “engine” doesn’t care what OS or platform it’s running on; it is the job of the platform code to hand over all the needed params.
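To give an idea of the shape of that boundary, here’s a hypothetical sketch (the struct and function names are mine, not the project’s):

```cpp
// Hypothetical sketch: the platform layer fills this in every frame and hands
// it to the base layer, so the "engine" never touches OS APIs directly.
struct PlatformFrame {
    int   windowWidth;      // current drawable resolution
    int   windowHeight;
    float mouseX, mouseY;   // cursor position in window coordinates
    bool  keysDown[256];    // keyboard state
    float deltaTime;        // seconds since the last frame
};

// Called by the Mac app (from the ViewController/OpenGLView side) once per
// frame, after the OpenGL context has been made current.
void EngineUpdateAndRender(const PlatformFrame* frame);
```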

I only use two 3rd party libraries: stb_truetype for font rendering and stb_image to load images of various formats. I didn’t want to dive into parsing *.ttf files and all the image formats myself.

The image parsing is a lot of hard labor without many interesting things going on. That’s one of the reasons most people, including Raylib, use stb_image for it.

The *.ttf parsing is as boring as the image parsing, but the font rendering itself is a very interesting problem to solve. Since this is a side project and I have very limited time, I decided to use this library for it. I needed fonts quite early, to display the frame rate and draw calls.


Zoomed in screenshot of the basic stats
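For the curious, the baked-atlas path of stb_truetype looks roughly like this (a minimal sketch; the engine’s actual glyph cache, sizes and error handling will differ):

```cpp
#define STB_TRUETYPE_IMPLEMENTATION
#include "stb_truetype.h"

static unsigned char atlas[512 * 512];  // single-channel glyph atlas, uploaded as a texture
static stbtt_bakedchar baked[96];       // per-glyph metrics for ASCII 32..126

// ttf points at the raw *.ttf file contents, loaded elsewhere.
void BakeFontAtlas(const unsigned char* ttf)
{
    // Rasterize the printable ASCII range at 32px into one atlas.
    stbtt_BakeFontBitmap(ttf, 0, 32.0f, atlas, 512, 512, 32, 96, baked);
}

void LayoutText(const char* text)
{
    float penX = 0, penY = 0;
    for (const char* c = text; *c; ++c) {
        stbtt_aligned_quad q;
        stbtt_GetBakedQuad(baked, 512, 512, *c - 32, &penX, &penY, &q, 1);
        // q now holds the screen and texture coordinates for one glyph quad.
    }
}
```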

For the moment I’m using OpenGL because there’s a lot of learning material online. This is not the first time I’ve used OpenGL, but the last time was more than 10 years ago, with the fixed-pipeline OpenGL ES on a Samsung Galaxy S Plus Android phone (among the first Samsung Galaxy phones). Modern OpenGL (which is deprecated by the way, Vulkan being the new-ish recommended API) is much more powerful and flexible, and there was a lot to learn.

All the rendering code is abstracted in a Renderer file, which in turn calls OpenGL code that lives in a different file. The plan is to also add Metal as a rendering API option in its own file: switching a preprocessor flag will make the renderer call Metal instead of OpenGL. The render code does basic stuff via functions like DrawStaticText, DrawDynamicText, DrawBatchedQuads, DrawImmediateQuad etc., so whenever you draw something on the screen you don’t care if it’s OpenGL, Metal, Vulkan, DirectX or anything else. For now we only plan to add Metal besides the existing OpenGL. There will be some annoyances, like having to write the same shaders twice: once for OpenGL and once for Metal. Usually people have their own shading language which gets converted to whatever they need. Since we don’t have time for that, we’ll start by writing shaders twice (we won’t have that many shaders in the beginning anyway).
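The dispatch itself can be as simple as this sketch (everything past the Draw* name, including the flag and the Color type, is a hypothetical stand-in):

```cpp
// Renderer.cpp -- the public drawing functions stay API-agnostic; a
// preprocessor flag decides which backend file actually does the work.
void DrawImmediateQuad(float x, float y, float w, float h, Color color)
{
#if defined(USE_METAL)
    Metal_DrawImmediateQuad(x, y, w, h, color);   // Metal backend (planned)
#else
    OpenGL_DrawImmediateQuad(x, y, w, h, color);  // current OpenGL backend
#endif
}
```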

Things we built so far

One of the first things I had to build was a hash map, since we’re not using the standard library. I just wrote a basic open-addressing hash map that we use for profiling code. We try to use plain arrays for everything we can get away with, so the hash map is only used to gather debug data (at least for now).
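A minimal open-addressing map, assuming a fixed, zero-initialized table and integer keys (the real one surely differs), could look like this:

```cpp
#include <stdint.h>

// Linear-probing hash map sketch: key 0 is reserved as the empty-slot marker,
// and the table size is a power of two so the hash can be masked.
enum { MAP_SIZE = 1024 };

struct DebugMap {
    uint64_t keys[MAP_SIZE];    // assumed zero-initialized: DebugMap map = {};
    uint64_t values[MAP_SIZE];
};

void DebugMapPut(DebugMap* map, uint64_t key, uint64_t value)
{
    uint64_t i = (key * 0x9E3779B97F4A7C15ull) & (MAP_SIZE - 1); // cheap multiplicative hash
    while (map->keys[i] != 0 && map->keys[i] != key)             // probe until a free or matching slot
        i = (i + 1) & (MAP_SIZE - 1);
    map->keys[i]   = key;
    map->values[i] = value;
}
```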

I also wrote a collection of math functions, since we need a lot of vector and matrix math. It is very basic at the moment, with no optimisations; we could improve it quite a bit by using SIMD for matrix multiplication and other operations, but for now it’s good enough the way it is.

The (sort of) more complex things we built are the custom allocators. Besides cache friendliness, the reason for doing this was to have a clearer understanding of memory lifetimes, as well as to avoid memory leaks. Our allocators never call malloc or new; they call mmap directly to allocate pages of memory. We have 3 types of allocators, which I call arenas, even though technically only the linear allocator can be called an “arena”. You can learn more about this type of custom allocator by checking out this and this.

The first of them is the LinearArena. It works by allocating a chunk of memory upfront (let’s say 2MB). Every time we need some memory from this arena, we return a pointer to the required address and increment a counter. At the start, when the arena is empty, this counter is zero. After we allocate, let’s say, a 2KB variable, we increase the counter by 2 * 1024 (sort of, because we also need to account for memory alignment), thus arriving at the next address the next variable can get its memory from. You never decrement that counter, except when you’re ready to flush the whole arena.

This is good in a realtime application like a game. If you have a 60 frames per second game and you need some memory each frame to, let’s say, update some positions, you can use this LinearArena to get all the memory you need. At the end of the frame you just reset the counter to zero and reuse the same memory on the next frame. This is very lightweight, with very little overhead, and you don’t have to bother the OS for new memory. (I know that technically malloc doesn’t bother the OS all the time and has more sophisticated memory management, but that’s not the point.)
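Boiled down to code, the idea looks something like this (a sketch with the names mine, error handling elided and align assumed to be a power of two):

```cpp
#include <stdint.h>
#include <stddef.h>
#include <sys/mman.h>

struct LinearArena {
    uint8_t* base;      // start of the mmap'd block
    size_t   used;      // the counter: offset of the next free byte
    size_t   capacity;
};

LinearArena LinearArenaCreate(size_t capacity)
{
    // Pages straight from the OS; mmap returns MAP_FAILED on error (unchecked here).
    void* mem = mmap(NULL, capacity, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return LinearArena{ (uint8_t*)mem, 0, capacity };
}

void* LinearArenaAlloc(LinearArena* arena, size_t size, size_t align)
{
    size_t offset = (arena->used + align - 1) & ~(align - 1); // bump up to alignment
    if (offset + size > arena->capacity) return NULL;         // out of space
    arena->used = offset + size;                              // the counter only grows
    return arena->base + offset;
}

// At the end of the frame: flush everything and reuse the same memory.
void LinearArenaReset(LinearArena* arena) { arena->used = 0; }
```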

The second custom allocator is the StackArena. This is exactly what it sounds like: it’s like the LinearArena, with the added benefit of being able to decrement the counter. We have a Push/Pop dynamic here. This is useful for reading small files into memory and other situations in which you need a bit of memory for a little while and then discard it. You can see a sample in the image below.


StackArena usage to read a file into memory
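In case the screenshot doesn’t come through, here is a hypothetical reconstruction of the same idea (the helper names are mine, not the project’s):

```cpp
// Push scratch memory, read the file into it, and Pop on the failure paths;
// on success the caller Pops once it's done with the contents.
uint8_t* ReadEntireFile(StackArena* arena, const char* path, size_t* outSize)
{
    size_t size = GetFileSize(path);                        // hypothetical helper
    uint8_t* buffer = (uint8_t*)StackArenaPush(arena, size);
    if (!buffer) return NULL;                               // arena full, nothing was pushed

    if (!ReadFileInto(path, buffer, size)) {                // hypothetical helper
        StackArenaPop(arena, size);                         // failure: release the scratch memory
        return NULL;
    }

    *outSize = size;
    return buffer;                                          // caller Pops when finished
}
```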

You could argue that a much better approach would be to check the result after this function returns and do a Pop there. This already happens for the successful result, but maybe it would be cleaner to do it for the other cases too. It’s by no means perfect; everything is a work in progress.

The last custom allocator is the FreeListArena, which is basically a custom malloc. The advantage is that, unlike with malloc, we know exactly what’s happening with the memory. This allocator is used in situations where neither the LinearArena nor the StackArena can be used.

The simpler the allocator, the better (less overhead). That’s why we try to use LinearArenas wherever possible, jump to the StackArena when it makes sense, and only use the FreeList one when it’s the only option that works for a specific use-case.

All these allocators take memory alignment into account for faster memory access (better put: to avoid slow memory access). This part complicated things a bit because we had to make sure that all the addresses are properly aligned. The alignment is CPU dependent, so we have to take it into account when making builds for certain architectures. We will also have to change mmap to VirtualAlloc if we plan to support Windows, but that’s nothing that can’t be handled with preprocessor flags.
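That platform switch is small; a sketch of how the page allocation could be wrapped (the wrapper name is mine):

```cpp
#include <stddef.h>
#if defined(_WIN32)
  #include <windows.h>
#else
  #include <sys/mman.h>
#endif

// Get `size` bytes of zeroed pages straight from the OS. Error handling is
// elided (mmap returns MAP_FAILED, VirtualAlloc returns NULL on failure).
void* OsAllocPages(size_t size)
{
#if defined(_WIN32)
    return VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
#else
    return mmap(NULL, size, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
#endif
}
```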

Besides knowing which data is in which arena and knowing all its lifetimes, I also built some rudimentary debug tools that let us view all the arenas and their load. In the video below you can see how it looks. The FreeListArena screen has a test arena at the top with holes in it, so we can see what a fragmented FreeListArena would look like.

Memory Dashboard Demo

Showcase

Now it’s time to show you what we can do with our tech so far.

We can display text (amazing, I know). There are two types of text rendering: dynamic and static. The dynamic text rendering renders each character individually, which means we have to make a bunch of draw calls. The static text rendering uses a single draw call for the whole text. Each character is generated on the fly and stored in a cache (which, like almost everything, is just an array).

We can display quads, either with a solid color or with a texture (we also support tiling textures). Just for fun, I made some snowfall with quads (I advise you to sit down before viewing this, just in case you get overwhelmed and confuse it with real life).

Snow demo using dynamic batched quads

There are also some interesting particle FX, but unfortunately the video compression messes them up (they look much cooler live).

Particles in a square shape

I then finally worked my way up to the third dimension. The first thing was learning about all the space transforms: model to world, world to view, view to clip and others. Then I added a free-moving 3D camera (even though cameras don’t really exist, it’s just matrices all the way down). At this point the UI is using a “UI Camera” and all the 3D stuff is using a “3D Camera”. You can see a demo of the free-moving 3D camera below.

Free 3D Camera Demo
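The whole transform chain fits in a few lines; a sketch using assumed Mat4/Vec4 types (column-major, vectors multiplied on the right):

```cpp
Vec4 LocalToClip(Vec3 localPos, Mat4 model, Mat4 view, Mat4 projection)
{
    Mat4 mvp = projection * view * model;   // model -> world -> view -> clip
    return mvp * Vec4{ localPos.x, localPos.y, localPos.z, 1.0f };
    // The GPU then does the perspective divide (xyz / w) and the viewport
    // transform to land on actual pixels. "The camera" is just viewMatrix.
}
```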

Afterwards I implemented lighting. At the moment we support only a single directional light source, because we don’t need more for now. We use a simple Blinn lighting model because it’s good enough for our needs and it’s also as lightweight as you can get. We have ambient lighting (a simple constant), diffuse and specular lighting.

Blinn Lighting Model Demo
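The Blinn model’s trick is using the halfway vector between the light and view directions for the specular term. Sketched here as C-style C++ for illustration (in the engine this math lives in the fragment shader; normalize() and dot() are assumed from the math layer):

```cpp
#include <math.h>

// N = surface normal, L = direction to the light, V = direction to the
// camera; all normalized Vec3s.
float BlinnLighting(Vec3 N, Vec3 L, Vec3 V, float shininess)
{
    Vec3 H = normalize(L + V);                          // halfway vector
    float ambient  = 0.1f;                              // the simple constant term
    float diffuse  = fmaxf(dot(N, L), 0.0f);            // Lambert diffuse
    float specular = powf(fmaxf(dot(N, H), 0.0f), shininess);
    return ambient + diffuse + specular;                // scales the surface color
}
```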

Finally I arrived at the place where I got stuck with Raylib: shadow mapping (sort of, because importing 3D models and animations is not yet supported).

After a few attempts I managed to make shadows work; you can see it in the video below. I also added a debug panel that shows the current state of the depth map from the light’s point of view.

Shadow Demo
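For reference, the core of the shadow test, written as C-style C++ for illustration (the real comparison happens per-fragment in the shader; SampleDepthMap and bias are assumptions):

```cpp
bool FragmentInShadow(Vec3 worldPos, Mat4 lightProj, Mat4 lightView, float bias)
{
    // Pass 1 rendered the scene's depth from the light's point of view into
    // the depth map. Here, bring this fragment's world position into light
    // space -- same-space math again -- and compare depths.
    Vec4 lc = lightProj * lightView * Vec4{ worldPos.x, worldPos.y, worldPos.z, 1.0f };
    Vec3 ndc = Vec3{ lc.x, lc.y, lc.z } * (1.0f / lc.w);      // perspective divide
    Vec3 uvDepth = ndc * 0.5f + Vec3{ 0.5f, 0.5f, 0.5f };     // [-1,1] -> [0,1]

    float closestToLight = SampleDepthMap(uvDepth.x, uvDepth.y);
    return uvDepth.z - bias > closestToLight;                 // bias fights "shadow acne"
}
```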

The recurring theme of all the 3D work was moving between spaces. Most of the issues I encountered along the way were related to not being in the correct space. It is very important to have everything in the same space when computing light-object interaction or, in this case, when comparing the depth buffer with the depth map. It doesn’t matter if it’s world space, model space, view space, light space or whatever other space; it just has to be the same space for any of it to make sense. Although, sometimes you can squeeze out some performance by choosing a space that requires fewer overall computations (but, like everything else: it depends).

Because we first render to a framebuffer before displaying it on the screen, we can also do post-processing. This is just a fancy way of saying that you can run a fragment shader on the final image before displaying it. You can see an inverted colors post-processing effect below,

Inverted Colors Demo

as well as a distorted image post-processing effect.

Distorted Image Demo
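The flow behind both effects could be sketched like this (the function names are hypothetical stand-ins for the renderer’s):

```cpp
// Render-to-framebuffer, then one fullscreen quad through a post-process shader.
void RenderFrameWithPostProcessing()
{
    BindFramebuffer(sceneFramebuffer);      // 1. render the scene off-screen
    RenderScene();

    BindFramebuffer(0);                     // 2. back to the default framebuffer
    UseShader(invertShader);                // fragment shader: color = 1.0 - color
    DrawFullscreenQuad(sceneColorTexture);  // one quad covering the whole screen
}
```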

The latest addition is textured boxes with normal maps, which required changing some of the rendering code. The camera movement is not ideal, as it moves in large steps and doesn’t have acceleration/deceleration, but it’s just for debug purposes anyway.

Textured box with normal map
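Normal mapping is one more same-space exercise: the per-pixel normal is stored in tangent space and has to be brought into the space the lighting uses. A sketch with the assumed Vec3 math from earlier:

```cpp
// T, B, N are the per-vertex tangent, bitangent and normal (world space).
Vec3 WorldNormalFromMap(Vec3 sampled, Vec3 T, Vec3 B, Vec3 N)
{
    // The normal map stores components in [0,1]; remap to [-1,1] tangent space.
    Vec3 n = sampled * 2.0f - Vec3{ 1.0f, 1.0f, 1.0f };
    // The TBN basis moves the sampled normal into world space, where it can
    // feed the Blinn lighting from earlier.
    return normalize(T * n.x + B * n.y + N * n.z);
}
```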

Learnings

To say that the initial goal of this project, which was learning, was reached is an understatement. I now have a deeper understanding of how both CPUs and GPUs work, and I have a greater appreciation for some of the things that go on behind the scenes in all those commercial engines.

When I do native mobile development, on both iOS and Android, I can now look at the code with fresh eyes and new understanding. I can also think about how I would implement some of the underlying systems that we take for granted. So, in a way, this side project helped me grow and get better at my main work. I have a great appreciation for how easy we mobile devs have it, with all those features, data structures and rendering coming out of the box. At least when it comes to most of the UI, we’re just gluing things together.

Future plans

I’ll continue working on this as a side project in the long run. We’ll replace the 3D part of Agility 3D with our own tech. A3D is a perfect contender for this, because we know what the final result should be and it has no sound (sound is a whole new area that requires a lot of work).

The next step is to load 3D models and animations inside our base layer. Afterwards I’ll start working on an A3D engine and update the base layer along the way.

There is also quite a bit of bug-fixing that needs to happen.

Conclusions

Even though it took around 14 months to reach this point in the project, I only worked on it on some weekends and some weekday evenings. I estimate that I put somewhere between 3 and 4 weeks of total work into it.

At this point I think I can dive into working with 3rd party engines more easily than I used to, since I now understand a bunch of the underlying systems. But as I mentioned, that’s not the plan.

So, why will I continue to work on our own “engines”?

  1. It’s a lot of fun for me. There’s something liberating about having control over most of the code and knowing that, if any issue arises, it’s your fault with 99.99% certainty.
  2. I’m learning more things than I ever did and I’m experiencing great professional growth that spans beyond 3D graphics.
  3. We write the minimum amount of code that we need, the way we need it. All 3rd party engines have a bunch of features that we’ll never ever use and we have to conform to their constraints in order to write what we want.
  4. We don’t have to download a new engine version just to update some plugins whenever Apple or Google wants us to use a new API, or else the app goes down. Usually that update was accompanied by a magenta screen (those who know, know).
  5. I’m curious to see how far we can get in the long run.

I hope you enjoyed reading this article. You can find more on our website.
