Modern Rendering Pipelines

In my previous post, OpenGL Quick Start, I created a barebones glfw render loop and added the absolute minimum of triangle rendering logic using immediate mode glBegin/glEnd style code and the fixed function glColor3f and glVertex2f calls. I also described this as “old school” and the least efficient way possible. Understanding what I meant by that, and what the modern practices are, requires a little bit of a history lesson and some understanding of computer hardware.

Pipelines: How we got here

Without going TOO far back in history to simple text games and early graphics, the first version of OpenGL made a pretty literal translation of the basic text rendering loop: set color, print characters – but instead of characters it was: set color, draw triangles.

From a hardware perspective, the CPU has to send commands over the PCI bus to the video card, which is a slow process. To speed this up, video cards would use video card memory to store a buffer of what to draw and the CPU would send commands to update the video card memory. But keeping track of what the video card is displaying is so hard that rendering loops generally clear the whole screen and just send the whole scene again. Every frame. Did I mention the PCI bus is slow?

Enter Vertex Buffer Objects, which were promoted into the core OpenGL feature set in version 1.5. Vertex Buffer Objects allow the system to allocate memory in the most efficient place (system or video card), fill it quickly, and re-use the data across multiple function calls and passes of your rendering loop. With the vertex data loaded into the video card, OpenGL 2.0 then introduced the OpenGL Shading Language (GLSL), which lets developers send small programs that perform operations on the vertex data within the video card itself. These powerful new features allowed OpenGL 2.0 to offload both vertex and fragment operations to the video card, greatly simplifying rendering loops so that a small amount of data could create complex scenes.

OpenGL 3.0 added a few features (and a formalized deprecation mechanism that got rid of things like glBegin/glEnd and all the fixed functions), but only introduced one major change to the rendering pipeline: Geometry Shaders. I’m at a loss for what is really amazing about Geometry Shaders, and it looks a bit like the industry at large was too, because the examples I am finding online are just not that interesting. Essentially, given vertex data you can change it or add more geometry before sending it to the next step in the rendering pipeline. This might be useful for some kinds of particle effects, but if you are thinking, like I was, “hey, maybe this can be used to make wavy water” – the problem is the geometry shader is poorly suited to subdividing triangles into many smaller triangles, so you end up with very strange, low poly water with weird lighting artifacts.

Which brings us to OpenGL 4.0, which leaves Geometry Shaders in their place but adds Tessellation and Compute shaders. Tessellation is the solution to the shortcomings of the Geometry Shader: it does allow for the subdivision of vertex data into smaller primitives, and it unlocks some remarkable optimizations where low vertex count models can be loaded into the GPU and turned into high polygon visualizations, or a smooth water surface can have waves roll across it.

I was reading right along through all of these rendering pipeline changes and everything was pretty obvious – the vertex shader manipulates vertex data, the fragment shader manipulates fragments of pixels being drawn, the geometry shader badly manipulates vertex data, the tessellation shader does really cool stuff with vertex data. But. Compute Shaders. These are just. Anything. A google search for examples turns up some amazing results, including this beauty: Compute Shaders are Nifty

Maybe in a future post I will explore compute shaders…

Updating my render loop

The OpenGL library defaults to version 1.0, with the glBegin/glEnd style immediate mode rendering. In order to use newer features you have to use an Extension Loading Library. There are a few to choose from, and a lot of tutorials use GLEW or GL3W, but I am going to use GLAD because it supports both OpenGL and OpenGL ES, has a powerful loader generator, and is easily installed using Conan.

conanfile.py

    requires = [
        "msys2/20210105",
        "cmake/3.20.4",
        "fmt/8.0.0",
        "glog/0.5.0",
        "glad/0.1.34",
        "glfw/3.3.4",
    ]

In order to use GLAD to create a new OpenGL context with access to the new features and rendering pipeline, you have to explicitly tell glfw which version of OpenGL you want to use. I’d like my code to run on Windows, Mac, and Linux – but Mac only supports up to OpenGL 4.1 (Apple has deprecated OpenGL in favor of its own 3D library, Metal). I mentioned wanting to support OpenGL ES, the Embedded Systems (i.e. mobile-type device) version of OpenGL, which targets more limited hardware and does not support all the features of a full OpenGL system. Figuring out the intersection of both feature sets is complex, and I am not going to worry about a full cross-platform solution at the moment.

The new minimal code to get a reasonably modern rendering pipeline using OpenGL 4.1 looks like this:

main.cpp

#include <fmt/core.h>
#include <glog/logging.h>

#include <glad/glad.h>
#include <GLFW/glfw3.h>

int main(int argc, const char *argv[]) {
    google::InitGoogleLogging(argv[0]);
    FLAGS_logtostderr = 1;
    LOG(INFO) << "Starting";

    if (!glfwInit())
    {
        LOG(ERROR) << "glfwInit::error";
        return -1;
    }

    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 4);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 1);
    // required on macOS (and good practice elsewhere) for a 3.2+ context
    glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);
    glfwWindowHint(GLFW_OPENGL_FORWARD_COMPAT, GLFW_TRUE);

    GLFWwindow* window = glfwCreateWindow(640, 480, argv[0], NULL, NULL);
    if (!window)
    {
        LOG(ERROR) << "glfwCreateWindow::error";
        return -1;
    }

    glfwMakeContextCurrent(window);

    if (!gladLoadGL())
    {
        LOG(ERROR) << "gladLoadGL::error";
        return -1;
    }

    glViewport(0, 0, 640, 480);
    glClearColor(0, 0, 0, 0);

    double lastTime = glfwGetTime();
    int nbFrames = 0;

    while (!glfwWindowShouldClose(window))
    {
        // reset the view
        glClear(GL_COLOR_BUFFER_BIT);
 
        // draw something

        // display output
        glfwSwapBuffers(window);
        glfwPollEvents();

        double currentTime = glfwGetTime();
        nbFrames++;

        if ( currentTime - lastTime >= 1.0 ) {
            glfwSetWindowTitle(window, fmt::format("{:>5.0f} fps", nbFrames/(currentTime - lastTime)).c_str());
            nbFrames = 0;
            lastTime = currentTime;
        }
    }

    glfwDestroyWindow(window);
    glfwTerminate();

    LOG(INFO) << "Exiting";
    return 0;
}

I’ve added the glad header, used glfwWindowHint to specify the version to use, and called gladLoadGL to dynamically bind all the new OpenGL functions at run time.

From here the “draw something” section should be expanded to follow the OpenGL Rendering Pipeline, which consists of the following steps:

  1. Vertex Processing
    1. Vertex Specification
    2. Apply Vertex Shaders
    3. Apply Tessellation Shaders (optional)
    4. Apply Geometry Shaders (optional)
  2. Vertex Post-Processing (optional)
    1. Transform Feedback
    2. Primitive Assembly
  3. Fragment Processing
    1. Generate Fragments
    2. Apply Fragment Shaders
  4. Fragment Post-Processing (optional)
    1. Perform operations like depth testing, stenciling, and masking

Following the breakdown above, you can see there are only two required steps: Applying vertex shaders to your geometry, and applying fragment shaders to generate output.

Define Vertex and Fragment Shaders

I will be following the rather dense OpenGL tutorial on vertex and fragment shaders and making things work with glfw, glad, and C++. Before anything can be drawn by the GPU we must define both a vertex and a fragment shader.

For this basic test I’m just going to define a very basic vertex shader that takes in 2D points, and an even simpler fragment shader that only generates the color yellow.

    int  success;
    char infoLog[512];

    // create vertex shader
    const char *vertexShaderSource = R"shader(
        #version 330 core
        layout (location = 0) in vec2 in_pos;

        void main()
        {
            gl_Position = vec4(in_pos.x, in_pos.y, 0.0, 1.0);
        }
    )shader";

    unsigned int vertexShader;
    vertexShader = glCreateShader(GL_VERTEX_SHADER);

    glShaderSource(vertexShader, 1, &vertexShaderSource, NULL);
    glCompileShader(vertexShader);

    glGetShaderiv(vertexShader, GL_COMPILE_STATUS, &success);

    if(!success)
    {
        glGetShaderInfoLog(vertexShader, 512, NULL, infoLog);
        LOG(ERROR) << "SHADER::VERTEX::COMPILATION_FAILED\n" << infoLog;
        return -1;
    }

    // create fragment shader
    const char *fragmentShaderSource = R"shader(
        #version 330 core
        out vec4 FragColor;

        void main()
        {
            FragColor = vec4(1.0f, 1.0f, 0.0f, 1.0f);
        }
    )shader";

    unsigned int fragmentShader;
    fragmentShader = glCreateShader(GL_FRAGMENT_SHADER);

    glShaderSource(fragmentShader, 1, &fragmentShaderSource, NULL);
    glCompileShader(fragmentShader);

    glGetShaderiv(fragmentShader, GL_COMPILE_STATUS, &success);

    if(!success)
    {
        glGetShaderInfoLog(fragmentShader, 512, NULL, infoLog);
        LOG(ERROR) << "SHADER::FRAGMENT::COMPILATION_FAILED\n" << infoLog;
        return -1;
    }

Once the shaders are compiled, we need to tell OpenGL to create a “program” that runs the two shaders.

    // create shader program
    unsigned int shaderProgram;
    shaderProgram = glCreateProgram();

    glAttachShader(shaderProgram, vertexShader);
    glAttachShader(shaderProgram, fragmentShader);
    glLinkProgram(shaderProgram);

    glGetProgramiv(shaderProgram, GL_LINK_STATUS, &success);
    if(!success) {
        glGetProgramInfoLog(shaderProgram, 512, NULL, infoLog);
        LOG(ERROR) << "SHADER::PROGRAM::LINKING_FAILED\n" << infoLog;
        return -1;
    }

    glUseProgram(shaderProgram);

Vertex Specification

Modern OpenGL versions require you to define a vertex array object and then load your vertex data into a vertex buffer object.

    // define vertex data
    float vertices[][2] = {
        { 0.0f,  0.5f },
        {-0.5f, -0.5f },
        { 0.5f, -0.5f }
    };

    GLuint vao;
    glGenVertexArrays(1, &vao);
    glBindVertexArray(vao);
    
    GLuint vbo;
    glGenBuffers(1, &vbo);

    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);
    glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, 0);
    glEnableVertexAttribArray(0);
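
A note on the glVertexAttribPointer arguments above: a stride of 0 works because each vertex is just two tightly packed floats. If you later interleave position and color in a single struct (a hypothetical layout, not used in this post), the stride becomes sizeof(Vertex) and the attribute offsets come from offsetof:

```cpp
#include <cstddef>

// Hypothetical interleaved layout: 2D position plus RGB color per vertex.
struct Vertex {
    float pos[2];
    float color[3];
};

// The values you would hand to glVertexAttribPointer for each attribute:
constexpr std::size_t stride    = sizeof(Vertex);          // bytes between consecutive vertices
constexpr std::size_t pos_off   = offsetof(Vertex, pos);   // byte offset of attribute 0
constexpr std::size_t color_off = offsetof(Vertex, color); // byte offset of attribute 1
```

With that layout, both glVertexAttribPointer calls would pass stride as the stride argument, with pos_off and color_off as the final pointer argument.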

Render Loop

Once we have defined our vertex and fragment shaders, created a program to run them, and loaded some data, our render loop now looks like this:

    LOG(INFO) << "RENDERLOOP::BEGIN";

    glViewport(0, 0, 640, 480);
    glClearColor(0, 0, 0, 0);
    
    glUseProgram(shaderProgram);

    double lastTime = glfwGetTime();
    int nbFrames = 0;

    while (!glfwWindowShouldClose(window))
    {
        // reset the view
        glClear(GL_COLOR_BUFFER_BIT);
 
        // draw something
        glDrawArrays(GL_TRIANGLES, 0, 3);

        // display output
        glfwSwapBuffers(window);
        glfwPollEvents();

        double currentTime = glfwGetTime();
        nbFrames++;

        if ( currentTime - lastTime >= 1.0 ) {
            glfwSetWindowTitle(window, fmt::format("{:>5.0f} fps", nbFrames/(currentTime - lastTime)).c_str());
            nbFrames = 0;
            lastTime = currentTime;
        }
    }

    LOG(INFO) << "RENDERLOOP::END";

And, if your mother told you to clean up after yourself:

    ...

    glUseProgram(0);
    glDisableVertexAttribArray(0);
    glDetachShader(shaderProgram, vertexShader);
    glDetachShader(shaderProgram, fragmentShader);
    glDeleteProgram(shaderProgram);
    glDeleteShader(vertexShader);
    glDeleteShader(fragmentShader);
    glDeleteBuffers(1, &vbo);
    glDeleteVertexArrays(1, &vao);

    glfwDestroyWindow(window);
    glfwTerminate();

    LOG(INFO) << "Exiting";
    return 0;
}

Conclusion

That is certainly a LOT more code than the glBegin/glEnd block.

The main reason for switching to a newer version of OpenGL would be to take advantage of higher throughput operations using vertex array objects and their associated calls like glDrawArrays, as well as tapping into the full programmability of shaders.

In my next post I will expand on this basic rendering loop to display a full 3D cube to give us a better set of geometry to test out shaders with, and begin to add more features like keyboard/mouse control, better viewport management, and break the program down into functions.

OpenGL Quick Start

In my previous post, Game Engine Adventures, I talked about why I would even attempt to write my own game engine (for fun!) and how I set up my environment.

In this post I am going to start with the OpenGL equivalent of Hello World, a window with an OpenGL context that displays a simple image.

Package Requirements

I’ve added additional packages to my conanfile: glog (for nice Python-style logging in C++) and glfw (for easy window creation and input management).

conanfile.py

    requires = [
        "msys2/20210105",
        "cmake/3.20.4",
        "glog/0.5.0",
        "glfw/3.3.4",
    ]

Minimal glfw loop

The glfw documentation claims it “Gives you a window and OpenGL context with just two function calls”. This is a pretty bold claim, and while it may be technically true, the result is a completely non-functional application that creates a window and simply exits.

The real minimal glfw loop looks like this:

main.cpp

#include <glog/logging.h>
#include <GLFW/glfw3.h>

int main(int argc, const char *argv[]) {
    google::InitGoogleLogging(argv[0]);
    FLAGS_logtostderr = 1;
    LOG(INFO) << "Starting";

    if (!glfwInit())
    {
        LOG(ERROR) << "glfwInit::error";
        return -1;
    }

    GLFWwindow* window = glfwCreateWindow(640, 480, argv[0], NULL, NULL);
    if (!window)
    {
        LOG(ERROR) << "glfwCreateWindow::error";
        return -1;
    }

    glfwMakeContextCurrent(window);

    glViewport(0, 0, 640, 480);
    glClearColor(0, 0, 0, 0);

    while (!glfwWindowShouldClose(window))
    {
        // Clear the view
        glClear(GL_COLOR_BUFFER_BIT);

        // Render something
        // TODO...

        // Display output
        glfwSwapBuffers(window);
        glfwPollEvents();
    }

    glfwDestroyWindow(window);
    glfwTerminate();

    LOG(INFO) << "Exiting";
    return 0;
}

This will initialize glog to write to the console, initialize glfw and create a window (the “two line” claim?), set up the OpenGL context to render to, begin a loop waiting for the window to close, and clean up glfw before exiting.

If you run this code it should open a plain black window that can be rendered to. It’s not very sexy.

Measuring Greatness

That’s great as a sanity check, but not very interesting. Let’s add a Frames Per Second counter in the title bar. I happen to really like Python f-strings, but I’m not ready to jump to C++20 to take advantage of std::format. The good news is that std::format is based on fmtlib, so I’m just going to add that to my conanfile.py

conanfile.py

    requires = [
        "msys2/20210105",
        "cmake/3.20.4",
        "fmt/8.0.0",
        "glog/0.5.0",
        "glfw/3.3.4",
    ]

Now we can update our main render loop to measure how many frames we can render in 1 second. The code here uses glfwGetTime as a more portable clock mechanism, counts how many frames have been rendered, and if the elapsed time is greater than 1 second it updates the title bar using fmt::format. The obscure “{:>5.0f}” string is fmtlib’s syntax for padding the value with leading spaces so the fps counter doesn’t move all over the place.

Did I pull in a whole library just to avoid sprintf or something equivalent? Yes and no. Check out the fmtlib benchmarks and you will see it outperforms many libraries, and it has great formatting options that we can take advantage of later.
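
For the curious, the “{:>5.0f}” spec maps to printf’s “%5.0f”: right-aligned in a field of five characters, zero decimal places. A stdlib-only sketch of the same title computation (the fps_title helper is mine, not from the real loop below):

```cpp
#include <cstdio>
#include <string>

// Compute fps from a frame count and elapsed seconds, then format it
// the way "{:>5.0f} fps" does: right-aligned, width 5, no decimals.
std::string fps_title(int nbFrames, double elapsed) {
    double fps = nbFrames / elapsed;
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%5.0f fps", fps);
    return std::string(buf);
}
```

Short readings like 60 fps come back as "   60 fps", so they stay aligned under longer values in the title bar.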

    double lastTime = glfwGetTime();
    int nbFrames = 0;

    while (!glfwWindowShouldClose(window))
    {
        // Clear the view
        glClear(GL_COLOR_BUFFER_BIT);

        // Render something
        // TODO...

        // Display output
        glfwSwapBuffers(window);
        glfwPollEvents();

        // Calculate FPS
        double currentTime = glfwGetTime();
        nbFrames++;

        if ( currentTime - lastTime >= 1.0 ) {
            glfwSetWindowTitle(window, fmt::format("{:>5.0f} fps", nbFrames/(currentTime - lastTime)).c_str());
            nbFrames = 0;
            lastTime = currentTime;
        }
    }

Old School Cool

Before diving into new OpenGL features like Vertex Buffer Objects, Vertex Shaders, Mesh Shaders, and general rendering pipeline complexity, let’s just sanity check ourselves by replacing the TODO block with a very basic and old school glBegin/glEnd block using glVertex2f immediate mode calls.

        // Draw a triangle
        glBegin(GL_TRIANGLES);
            glColor3f(1.0, 0.0, 0.0);
            glVertex2f(0, 0.5);
            glColor3f(0.0, 1.0, 0.0);
            glVertex2f(-0.5, -0.5);
            glColor3f(0.0, 0.0, 1.0);
            glVertex2f(0.5, -0.5);
        glEnd();

Conclusion

This is very far from a game. It literally outputs a triangle in the least efficient way possible. In my next post I will begin to explore the capabilities of new OpenGL versions and try to benchmark some of the different rendering methods.

Next Post: Modern Rendering Pipelines

Game Engine Adventures

I don’t recommend re-inventing the wheel. So why would I develop my own game engine instead of using one of the many available ones? Because I was working on an engine 20 years ago, before all these shiny new engines were freely available, and it has always been a passion of mine. And because I am not actually planning on making a game engine, but rather I want to learn what is new and different in the world of C++ development, and figured I’d learn it within the context of implementing a game engine.

Package Management

One thing that I have always found frustrating when developing C++ applications is package management. Fortunately there are a couple of modern solutions for C++ that work like npm, pip, or other package managers. The two most popular that I saw were vcpkg and conan.

vcpkg

I briefly looked at vcpkg. It does seem to support Windows, Mac, and Linux. But it is a Microsoft solution, so I have some distaste for tying myself to it. And the number of packages it supports appears to be far smaller than conan’s, so, purely on personal preference, I didn’t actually do any testing with it.

conan.io

This looks like a promising solution. It supports multiple package repositories, so you can add tens of thousands of different packages to your system, if you trust the sources. By default conan.io hosts packages on a JFrog Artifactory backend, which I currently use at work and have a high level of confidence in. The packages I am interested in using for development seem to be available by default, without needing to add additional repositories. And it supports a simple conanfile.txt configuration, or a more powerful conanfile.py configuration-as-code solution. I use Python daily, so this is quite appealing to me.
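
For reference, the same starting requirements in the simpler conanfile.txt form would look something like this (a sketch of the equivalent; I am sticking with conanfile.py below):

```
[requires]
msys2/20210105
cmake/3.20.4

[generators]
cmake
virtualenv
```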

A quick configuration file looks like this:

conanfile.py

from conans import ConanFile, CMake

class Dread(ConanFile):
    settings = "os", "compiler", "build_type", "arch"
    requires = [
        "msys2/20210105",
        "cmake/3.20.4",
    ]
    generators = [
        "virtualenv",
        "cmake",
        "gcc",
        "txt"
    ]
    default_options = {}

    def build(self):
        cmake = CMake(self)
        cmake.configure()
        cmake.build()

My initial requires section installs msys2 and cmake on my Windows laptop, which I normally just use for gaming since I prefer to do my daily development on a Mac. Eventually I would like to explore cross-compilation, but don’t want to slow myself down by doing too much at once.

The generators section tells conan what kind of outputs to generate using the requires (and other) sections, and the build function tells conan how to build using cmake.

With that conanfile.py in place I can simply install the dependencies and build my code like this:

cd build
conan install ..
conan build ..

I think the .. path style is an odd choice on the part of conan, but it is cleaner looking than the alternative “conan install . --install-folder build”.

Build Tools

Before I selected CMake I was looking at a few options. The top choices in my mind were: CMake, SCons, and Bazel.

SCons

This was originally brought to my attention by a couple of projects that I was familiar with, and I thought about giving it a try. After spending some time familiarizing myself with it, I became more and more aware of the lack of support behind it. First, it was more popular a few years ago and has been in steady decline since 2017. Second, as I was looking at other tools, the general sentiment around SCons was “nobody uses it so we haven’t really tested this”, with comments from other blogs saying “if you are supporting legacy systems then use it, if you are starting a new project then don’t”. So I moved on to checking out other tools.

Bazel

I thought it might be nice to check out Bazel because, well, Google. But, again, looking at trends, it appears there has been a HUGE drop off in Bazel popularity. I don’t know if this is purely pandemic related, or if there are other factors at play, but it looks like its popularity has died off in the last year. Combine that with the fact that it is written in Java – and I’m not interested in adding Java requirements to this project – and I moved on from Bazel.

CMake

I made a bit of a long, circuitous loop to get back to CMake. But it really came down to using something with minimal overhead, wider support, and the least extra tooling and complexity. CMake is at least somewhat similar to traditional Makefiles (more so than SCons or Bazel, because it generates Makefiles), and getting started using it with conan is well supported and very straightforward.

CMakeLists.txt

cmake_minimum_required(VERSION 3.20.4)
project(Dread)

include(${CMAKE_BINARY_DIR}/conanbuildinfo.cmake)
conan_basic_setup(NO_OUTPUT_DIRS)

link_libraries(
    ${CONAN_LIBS}
)

add_executable(Dread src/main.cpp)

The conan cmake generator creates a conanbuildinfo.cmake file that contains ALL your header and library definitions for the packages you’ve installed with conan. This makes it very easy to include them in your CMakeLists.txt file and quickly have a working build.

Conclusion

With package management and a build system in place I was able to easily compile my main.cpp source file and run a simple SDL demo to prove my development environment worked.

In my next post I will cover how I have set up my IDE, project directory structure, and the initial entry point for the engine.

Next Post: OpenGL Quick Start

Workaround for gmail error “downloading this attachment is disabled”

I wanted to download an attachment that was sent to me years ago, but now gmail blocks the download of attachments it doesn’t like and gives the user no way to get them. Their helpful advice was to ask the original sender to upload the file to Google Drive. Not helpful when the original sender no longer has a decade-old file.

Workaround

Gmail has your data right there; it just thinks it knows better than you. Maybe this helps protect some people. But I just want my file.

The first thing to do is view the original message source by clicking on the three dots at the top right of the email.

Then click the Download Original link and open the file in a text editor. At the top of the file you will find the message header and body, and just after that you will find the attachments encoded as text using base64 encoding. Each attachment will look like this:

--bcaec554d754b0f76a04d9fda578--
--bcaec554d754b0f77204d9fda57a
Content-Type: application/zip; name="test.zip"
Content-Disposition: attachment; filename="test.zip"
Content-Transfer-Encoding: base64
X-Attachment-Id: 123456789_0.1

d2h5IGFyZSB5b3UgdHJ5aW5nIHRvIGRlY29kZSB0aGlzPyB3aHkgYXJlIHlvdSB0cnlpbmcgdG8gZGVjb2RlIHRoaXM/d2h5IGFyZSB5b3UgdHJ5aW5nIHRvIGRlY29kZSB0aGlzP3doeSBhcmUgeW91IHRyeWluZyB0byBkZWNvZGUgdGhpcz93aHkgYXJlIHlvdSB0cnlpbmcgdG8gZGVjb2RlIHRoaXM/d2h5IGFyZSB5b3UgdHJ5aW5nIHRvIGRlY29kZSB0aGlzP3doeSBhcmUgeW91IHRyeWluZyB0byBkZWNvZGUgdGhpcz93aHkgYXJlIHlvdSB0cnlpbmcgdG8gZGVjb2RlIHRoaXM/d2h5IGFyZSB5b3UgdHJ5aW5nIHRvIGRlY29kZSB0aGlzP3doeSBhcmUgeW91IHRyeWluZyB0byBkZWNvZGUgdGhpcz93aHkgYXJlIHlvdSB0cnlpbmcgdG8gZGVjb2RlIHRoaXM/d2h5IGFyZSB5b3UgdHJ5aW5nIHRvIGRlY29kZSB0aGlzP3doeSBhcmUgeW91IHRyeWluZyB0byBkZWNvZGUgdGhpcz93aHkgYXJlIHlvdSB0cnlpbmcgdG8gZGVjb2RlIHRoaXM/d2h5IGFyZSB5b3UgdHJ5aW5nIHRvIGRlY29kZSB0aGlzP3doeSBhcmUgeW91IHRyeWluZyB0byBkZWNvZGUgdGhpcz93aHkgYXJlIHlvdSB0cnlpbmcgdG8gZGVjb2RlIHRoaXM/d2h5IGFyZSB5b3UgdHJ5aW5nIHRvIGRlY29kZSB0aGlzP3doeSBhcmUgeW91IHRyeWluZyB0byBkZWNvZGUgdGhpcz8K

--bcaec554d754b0f76a04d9fda578--

Either copy and paste the content between the X-Attachment-Id header and the closing boundary into a new file, or, for very large attachments, delete everything except the encoded attachment.

On a Mac/Unix environment you can use the base64 program to decode the attachment:

cat encoded.txt | base64 --decode > test.zip
unzip test.zip
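
If you don’t have a base64 tool handy (say, on Windows without msys2), a minimal C++ decoder does the same job. This is a quick sketch that skips padding and whitespace rather than validating them:

```cpp
#include <string>

// Decode standard base64. Any character outside the alphabet
// ('=' padding, newlines, MIME boundary dashes) is simply skipped.
std::string base64_decode(const std::string& in) {
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    unsigned int val = 0; // bit accumulator (only the low bits matter)
    int bits = -8;        // goes >= 0 once a full byte is available
    for (char c : in) {
        std::string::size_type pos = alphabet.find(c);
        if (pos == std::string::npos)
            continue;
        val = (val << 6) | static_cast<unsigned int>(pos);
        bits += 6;
        if (bits >= 0) {
            out.push_back(static_cast<char>((val >> bits) & 0xFFu));
            bits -= 8;
        }
    }
    return out;
}
```

Read the pasted base64 block from a file, run it through this, and write the result out in binary mode to recover the attachment.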