Modern Rendering Pipelines

In my previous post, OpenGL Quick Start, I created a barebones glfw render loop and added the absolute minimal triangle rendering logic using immediate mode glBegin/glEnd style code and fixed-function glColor3f and glVertex2f calls. I also described this as “old school” and just about the least efficient way possible. Understanding what I meant by that, and what the modern practices are, requires a little bit of a history lesson and some understanding of computer hardware.

Pipelines: How we got here

Without going TOO far back in history to simple text games and early graphics, the first version of OpenGL made a pretty literal translation of the basic text rendering loop: set color, print characters – but instead of characters it was: set color, draw triangles.

From a hardware perspective, the CPU has to send commands over the PCI bus to the video card, which is a slow process. To speed this up, video cards added on-board memory to store a buffer of what to draw, and the CPU would send commands to update that memory. But keeping track of exactly what the video card is displaying is hard enough that rendering loops generally clear the whole screen and just send the whole scene again. Every frame. Did I mention the PCI bus is slow?

Enter Vertex Buffer Objects, which were promoted into the core OpenGL feature set in OpenGL 1.5. Vertex Buffer Objects allow the system to allocate memory in the most efficient place (system or video card), fill it up quickly, and re-use the data across multiple function calls and passes of your rendering loop. With the vertex data living on the video card, OpenGL 2.0 then introduced the OpenGL Shading Language (GLSL), which allows developers to send small programs that perform operations on the vertex data within the video card itself. These powerful new features allowed OpenGL 2.0 to offload both Vertex and Fragment operations to the video card, greatly simplifying rendering loops so that a small amount of data could create complex scenes.

OpenGL 3.x added a few features (3.0 brought a formalized deprecation mechanism that eventually got rid of things like glBegin/glEnd and all the fixed functions), but only introduced one major change to the rendering pipeline: Geometry Shaders, which became core in OpenGL 3.2. I’m at a loss for what is really amazing about Geometry Shaders, and it looks a bit like the industry at large was too, because the examples I am finding online are just not that interesting. Essentially, given vertex data, you can change it or emit additional geometry before sending it to the next stage of the rendering pipeline. This might be useful for some kind of particle effect, but if you are thinking, like I was, “hey, maybe this can be used to make wavy water” – the problem is the geometry shader is a poor fit for subdividing triangles into many smaller triangles, so you end up with very strange, low-poly water with weird lighting artifacts.
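
To make that concrete, here is a minimal sketch of a pass-through geometry shader (standalone and illustrative, not wired into the code later in this post). It receives each triangle and simply re-emits its three vertices, which is about all the basic examples do:

```glsl
#version 330 core
layout (triangles) in;                          // one input triangle at a time
layout (triangle_strip, max_vertices = 3) out;  // emit at most 3 vertices

void main()
{
    // re-emit the incoming triangle unchanged
    for (int i = 0; i < 3; ++i) {
        gl_Position = gl_in[i].gl_Position;
        EmitVertex();
    }
    EndPrimitive();
}
```

You could nudge gl_Position around here, or emit a few extra primitives, but the max_vertices cap is exactly why this stage is awkward for serious subdivision work.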

Which brings us to OpenGL 4.0, which leaves Geometry Shaders in their place but adds Tessellation shaders (Compute Shaders arrived a little later, in OpenGL 4.3). Tessellation is the solution to the shortcomings of the Geometry Shader – it does allow for the subdivision of vertex data into smaller primitives, and it unlocks some remarkable optimizations where low-vertex models can be loaded into the GPU and turned into high-polygon visualizations, or things like a smooth water surface can have waves roll across them.
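
As a sketch of what that looks like, a tessellation control shader mostly just decides how finely to subdivide each patch – the fixed-function tessellator then generates the new geometry. This fragment is illustrative only (the hard-coded level of 4 is an assumption), not part of the program below:

```glsl
#version 410 core
layout (vertices = 3) out;  // operate on triangle patches

void main()
{
    // pass the control points through unchanged
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    // set subdivision levels once per patch: split each edge and the
    // interior into 4 segments (in a real program this would be dynamic,
    // e.g. based on distance from the camera)
    if (gl_InvocationID == 0) {
        gl_TessLevelOuter[0] = 4.0;
        gl_TessLevelOuter[1] = 4.0;
        gl_TessLevelOuter[2] = 4.0;
        gl_TessLevelInner[0] = 4.0;
    }
}
```

A matching tessellation evaluation shader then positions each generated vertex – which is where you would add the wave displacement the geometry shader couldn’t do well.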

I was reading right along through all of these rendering pipeline changes and everything was pretty obvious – vertex shader manipulates vertex data, fragment shader manipulates the fragments of pixels being drawn, geometry shader badly manipulates vertex data, tessellation shader does really cool stuff with vertex data. But. Compute Shaders. These are just. Anything. A Google search for examples turns up some amazing results, including this beauty: Compute Shaders are Nifty

Maybe in a future post I will explore compute shaders…

Updating my render loop

Out of the box, you only get access to an ancient version of OpenGL, with the glBegin/glEnd style immediate mode rendering. In order to use newer features you have to use an Extension Loading Library. There are a few to choose from, and a lot of tutorials use GLEW or GL3W, but I am going to use GLAD because it supports both OpenGL and OpenGL ES, has a powerful loader generator, and is easily installed using Conan.

conanfile.py

    requires = [
        "msys2/20210105",
        "cmake/3.20.4",
        "fmt/8.0.0",
        "glog/0.5.0",
        "glad/0.1.34",
        "glfw/3.3.4",
    ];

In order to use GLAD to create a new OpenGL context with access to the new features and rendering pipeline, you have to explicitly tell glfw which version of OpenGL you want to use. I’d like to make my code run on Windows, Mac, and Linux – but Mac only supports up to OpenGL 4.1 (Apple has deprecated OpenGL in favor of its own 3D library, Metal), and it will only hand you a modern context if you ask for a forward-compatible core profile. I mentioned wanting to support OpenGL ES, which is the Embedded Systems (i.e. mobile-type device) version of OpenGL, and which therefore targets more limited hardware and does not support all the features of a full OpenGL system. Figuring out the intersection of both feature sets is complex, and I am not going to worry about a full cross-platform solution at the moment.

The new minimal code to get a reasonably modern rendering pipeline using OpenGL 4.1 looks like this:

main.cpp

#include <fmt/core.h>
#include <glog/logging.h>

#include <glad/glad.h>
#include <GLFW/glfw3.h>

int main(int argc, const char *argv[]) {
    google::InitGoogleLogging(argv[0]);
    FLAGS_logtostderr = 1;
    LOG(INFO) << "Starting";

    if (!glfwInit())
    {
        LOG(ERROR) << "glfwInit::error";
        return -1;
    }

    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 4);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 1);
    // macOS only provides 3.2+ contexts as forward-compatible core profile
    glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);
    glfwWindowHint(GLFW_OPENGL_FORWARD_COMPAT, GLFW_TRUE);

    GLFWwindow* window = glfwCreateWindow(640, 480, argv[0], NULL, NULL);
    if (!window)
    {
        LOG(ERROR) << "glfwCreateWindow::error";
        return -1;
    }

    glfwMakeContextCurrent(window);

    if (!gladLoadGL())
    {
        LOG(ERROR) << "gladLoadGL::error";
        return -1;
    }

    glViewport(0, 0, 640, 480);
    glClearColor(0, 0, 0, 0);

    double lastTime = glfwGetTime();
    int nbFrames = 0;

    while (!glfwWindowShouldClose(window))
    {
        // reset the view
        glClear(GL_COLOR_BUFFER_BIT);
 
        // draw something

        // display output
        glfwSwapBuffers(window);
        glfwPollEvents();

        double currentTime = glfwGetTime();
        nbFrames++;

        if ( currentTime - lastTime >= 1.0 ) {
            glfwSetWindowTitle(window, fmt::format("{:>5.0f} fps", nbFrames/(currentTime - lastTime)).c_str());
            nbFrames = 0;
            lastTime = currentTime;
        }
    }

    glfwDestroyWindow(window);
    glfwTerminate();

    LOG(INFO) << "Exiting";
    return 0;
}

I’ve added the glad library import, used glfwWindowHint to specify the version (and profile) to use, and called gladLoadGL to dynamically bind all the new OpenGL features at run time.

From here the “draw something” section should be expanded to follow the OpenGL Rendering Pipeline, which consists of the following steps:

  1. Vertex Processing
    1. Vertex Specification
    2. Apply Vertex Shaders
    3. Apply Tessellation Shaders (optional)
    4. Apply Geometry Shaders (optional)
  2. Vertex Post-Processing (optional)
    1. Transform Feedback
    2. Primitive Assembly
  3. Fragment Processing
    1. Generate Fragments
    2. Apply Fragment Shaders
  4. Fragment Post-Processing (optional)
    1. Perform operations like depth testing, stenciling, and masking

Following the breakdown above, you can see there are only two required steps: Applying vertex shaders to your geometry, and applying fragment shaders to generate output.

Define Vertex and Fragment Shaders

I will be following the rather dense OpenGL tutorial on vertex and fragment shaders and making things work with glfw, glad, and C++. Before anything can be drawn by the GPU we must define both a vertex and a fragment shader.

For this basic test I’m just going to define a very basic vertex shader that takes in 2D points, and an even simpler fragment shader that only generates the color yellow.

    int  success;
    char infoLog[512];

    // create vertex shader
    const char *vertexShaderSource = R"shader(
        #version 330 core
        layout (location = 0) in vec2 in_pos;

        void main()
        {
            gl_Position = vec4(in_pos.x, in_pos.y, 0.0, 1.0);
        }
    )shader";

    unsigned int vertexShader;
    vertexShader = glCreateShader(GL_VERTEX_SHADER);

    glShaderSource(vertexShader, 1, &vertexShaderSource, NULL);
    glCompileShader(vertexShader);

    glGetShaderiv(vertexShader, GL_COMPILE_STATUS, &success);

    if(!success)
    {
        glGetShaderInfoLog(vertexShader, 512, NULL, infoLog);
        LOG(ERROR) << "SHADER::VERTEX::COMPILATION_FAILED\n" << infoLog;
        return -1;
    }

    // create fragment shader
    const char *fragmentShaderSource = R"shader(
        #version 330 core
        out vec4 FragColor;

        void main()
        {
            FragColor = vec4(1.0f, 1.0f, 0.0f, 1.0f);
        }
    )shader";

    unsigned int fragmentShader;
    fragmentShader = glCreateShader(GL_FRAGMENT_SHADER);

    glShaderSource(fragmentShader, 1, &fragmentShaderSource, NULL);
    glCompileShader(fragmentShader);

    glGetShaderiv(fragmentShader, GL_COMPILE_STATUS, &success);

    if(!success)
    {
        glGetShaderInfoLog(fragmentShader, 512, NULL, infoLog);
        LOG(ERROR) << "SHADER::FRAGMENT::COMPILATION_FAILED\n" << infoLog;
        return -1;
    }

Once the shaders are compiled, we need to tell OpenGL to create a “program” that runs the two shaders.

    // create shader program
    unsigned int shaderProgram;
    shaderProgram = glCreateProgram();

    glAttachShader(shaderProgram, vertexShader);
    glAttachShader(shaderProgram, fragmentShader);
    glLinkProgram(shaderProgram);

    glGetProgramiv(shaderProgram, GL_LINK_STATUS, &success);
    if(!success) {
        glGetProgramInfoLog(shaderProgram, 512, NULL, infoLog);
        LOG(ERROR) << "SHADER::PROGRAM::LINKING_FAILED\n" << infoLog;
        return -1;
    }

    glUseProgram(shaderProgram);

Vertex Specification

Modern OpenGL versions require you to define a vertex array object and then load your vertex data into a vertex buffer object.

    // define vertex data
    float vertices[][2] = {
        { 0.0f,  0.5f },
        {-0.5f, -0.5f },
        { 0.5f, -0.5f }
    };

    GLuint vao;
    glGenVertexArrays(1, &vao);
    glBindVertexArray(vao);

    GLuint vbo;
    glGenBuffers(1, &vbo);

    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);

    // attribute 0 matches "layout (location = 0) in vec2 in_pos" in the
    // vertex shader: 2 floats per vertex, tightly packed (stride 0), offset 0
    glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, (void*)0);
    glEnableVertexAttribArray(0);

Render Loop

Once we have defined our vertex and fragment shaders, created a program to run them, and loaded some data, our render loop now looks like this:

    LOG(INFO) << "RENDERLOOP::BEGIN";

    glViewport(0, 0, 640, 480);
    glClearColor(0, 0, 0, 0);
    
    glUseProgram(shaderProgram);

    double lastTime = glfwGetTime();
    int nbFrames = 0;

    while (!glfwWindowShouldClose(window))
    {
        // reset the view
        glClear(GL_COLOR_BUFFER_BIT);
 
        // draw something
        glDrawArrays(GL_TRIANGLES, 0, 3);

        // display output
        glfwSwapBuffers(window);
        glfwPollEvents();

        double currentTime = glfwGetTime();
        nbFrames++;

        if ( currentTime - lastTime >= 1.0 ) {
            glfwSetWindowTitle(window, fmt::format("{:>5.0f} fps", nbFrames/(currentTime - lastTime)).c_str());
            nbFrames = 0;
            lastTime = currentTime;
        }
    }

    LOG(INFO) << "RENDERLOOP::END";

And, if your mother told you to clean up after yourself:

    ...

    glUseProgram(0);
    glDisableVertexAttribArray(0);
    glDetachShader(shaderProgram, vertexShader);
    glDetachShader(shaderProgram, fragmentShader);
    glDeleteProgram(shaderProgram);
    glDeleteShader(vertexShader);
    glDeleteShader(fragmentShader);
    glDeleteBuffers(1, &vbo);
    glDeleteVertexArrays(1, &vao);

    glfwDestroyWindow(window);
    glfwTerminate();

    LOG(INFO) << "Exiting";
    return 0;
}

Conclusion

That is certainly a LOT more code than the glBegin/glEnd block.

The main reason for switching to a newer version of OpenGL is to take advantage of higher-throughput operations using vertex buffer and vertex array objects and their associated calls like glDrawArrays, as well as to tap into the full programmability of shaders.

In my next post I will expand on this basic rendering loop to display a full 3D cube to give us a better set of geometry to test out shaders with, and begin to add more features like keyboard/mouse control, better viewport management, and break the program down into functions.

mbrandeis