Modern Rendering Pipelines

In my previous post, OpenGL Quick Start, I created a barebones glfw render loop and added the absolute minimal triangle rendering logic using immediate mode glBegin/glEnd style code and the fixed function glColor3f and glVertex2f calls. I also described this as “old school” and the least efficient way possible. Understanding what I meant by that, and what the modern practices are, requires a little history lesson and some understanding of computer hardware.

Pipelines: How we got here

Without going TOO far back in history to simple text games and early graphics, the first version of OpenGL made a pretty literal translation of the basic text rendering loop: set color, print characters – but instead of characters it was: set color, draw triangles.

From a hardware perspective, the CPU has to send commands over the PCI bus to the video card, which is a slow process. To speed this up, video cards would use video card memory to store a buffer of what to draw and the CPU would send commands to update the video card memory. But keeping track of what the video card is displaying is so hard that rendering loops generally clear the whole screen and just send the whole scene again. Every frame. Did I mention the PCI bus is slow?

Enter OpenGL 1.5 and 2.0. OpenGL 1.5 promoted Vertex Buffer Objects into the core feature set. Vertex Buffer Objects allow the system to allocate memory in the most efficient place (system or video card), fill it up quickly, and re-use the data across multiple function calls and passes of your rendering loop. With the vertex data loaded onto the video card, OpenGL 2.0 then introduced the OpenGL Shading Language (GLSL), which lets developers send tiny programs that perform operations on the vertex data within the video card itself. These powerful new features allowed OpenGL 2.0 to offload both Vertex and Fragment operations to the video card, greatly simplifying rendering loops so that a small amount of data could create complex scenes.

OpenGL 3.0 added a few features (and a formalized deprecation mechanism that got rid of things like glBegin/glEnd and all the fixed functions), but only introduced one major change to the rendering pipeline: Geometry Shaders. I’m at a loss for what is really amazing about Geometry Shaders, and it looks a bit like the industry at large was too, because the examples I am finding online are just not that interesting. Essentially, given vertex data you can change it or add more geometry before sending it to the next step in the rendering pipeline. This might be useful for some kind of particle effect, but if you are thinking, like I was, “hey, maybe this can be used to make wavy water” – the problem is the geometry shader is a poor fit for subdividing triangles into many smaller triangles, so you end up with very strange and low poly water with weird lighting artifacts.
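
For reference, a geometry shader is written as GLSL source just like the vertex and fragment shaders later in this post, and is compiled with GL_GEOMETRY_SHADER. This is only an illustrative sketch (not code this post actually uses): a pass-through shader that takes a triangle in and emits a slightly nudged triangle out.

    const char *geometryShaderSource = R"shader(
        #version 330 core
        layout (triangles) in;
        layout (triangle_strip, max_vertices = 3) out;

        void main()
        {
            // re-emit the incoming triangle, nudged slightly upward
            for (int i = 0; i < 3; i++)
            {
                gl_Position = gl_in[i].gl_Position + vec4(0.0, 0.1, 0.0, 0.0);
                EmitVertex();
            }
            EndPrimitive();
        }
    )shader";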

Which brings us to OpenGL 4.0, which leaves Geometry Shaders in their place but adds Tessellation shaders (with Compute Shaders following in OpenGL 4.3). Tessellation is the solution to the shortcomings of the Geometry Shader – it does allow the subdivision of vertex data into smaller primitives, and it unlocks some remarkable optimizations: low vertex models can be loaded onto the GPU and turned into high polygon visualizations, and things like a smooth water surface can have waves roll across it.
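
To make that concrete, here is a rough sketch of the two tessellation stages, again as GLSL source strings in the same style as the shaders later in this post (illustrative only, not code this post compiles): a tessellation control shader that picks the subdivision levels and a tessellation evaluation shader that positions the generated vertices.

    const char *tessControlSource = R"shader(
        #version 410 core
        layout (vertices = 3) out;

        void main()
        {
            // pick how finely to subdivide the patch (fixed levels for simplicity)
            if (gl_InvocationID == 0)
            {
                gl_TessLevelInner[0] = 4.0;
                gl_TessLevelOuter[0] = 4.0;
                gl_TessLevelOuter[1] = 4.0;
                gl_TessLevelOuter[2] = 4.0;
            }
            // pass the control points through unchanged
            gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;
        }
    )shader";

    const char *tessEvalSource = R"shader(
        #version 410 core
        layout (triangles, equal_spacing, ccw) in;

        void main()
        {
            // interpolate each generated vertex from the three patch corners
            gl_Position = gl_TessCoord.x * gl_in[0].gl_Position +
                          gl_TessCoord.y * gl_in[1].gl_Position +
                          gl_TessCoord.z * gl_in[2].gl_Position;
        }
    )shader";

These would be compiled with GL_TESS_CONTROL_SHADER and GL_TESS_EVALUATION_SHADER, and the draw call changes to glDrawArrays(GL_PATCHES, ...) after telling OpenGL how many vertices make up a patch with glPatchParameteri(GL_PATCH_VERTICES, 3).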

I was reading right along through all of these rendering pipeline changes and everything was pretty obvious – vertex shader manipulates vertex data, fragment shader manipulates fragments of pixels being drawn, geometry shader badly manipulates vertex data, tessellation shader does really cool stuff with vertex data. But. Compute Shaders. These are just. Anything. A google search for examples turns up some amazing results, including this beauty: Compute Shaders are Nifty

Maybe in a future post I will explore compute shaders…

Updating my render loop

The OpenGL library defaults to version 1.0, with the glBegin/glEnd style immediate mode rendering. In order to use newer features you have to use an extension loading library. There are a few to choose from, and a lot of tutorials will use GLEW or GL3W, but I am going to use GLAD because it supports both OpenGL and OpenGL ES, has a powerful loader generator, and is easily installed using Conan.

conanfile.py

    requires = [
        "msys2/20210105",
        "cmake/3.20.4",
        "fmt/8.0.0",
        "glog/0.5.0",
        "glad/0.1.34",
        "glfw/3.3.4",
    ]

In order to use GLAD to create a new OpenGL context with access to the new features and rendering pipeline, you have to explicitly tell glfw which version of OpenGL you want to use. I’d like to make my code run on Windows, Mac, and Linux – but Mac only supports up to OpenGL 4.1 (it has deprecated OpenGL in favor of its own 3D library, Metal). I mentioned wanting to support OpenGL ES, which is the Embedded Systems (i.e. mobile-type devices) version of OpenGL and therefore targets more limited hardware and does not support all the features of a full OpenGL system. Trying to figure out the intersection of both feature sets is complex, and I am not going to worry about a full cross-platform solution at the moment.

The new minimal code to get a reasonably modern rendering pipeline using OpenGL 4.1 looks like this:

main.cpp

#include <fmt/core.h>
#include <glog/logging.h>

#include <glad/glad.h>
#include <GLFW/glfw3.h>

int main(int argc, const char *argv[]) {
    google::InitGoogleLogging(argv[0]);
    FLAGS_logtostderr = 1;
    LOG(INFO) << "Starting";

    if (!glfwInit())
    {
        LOG(ERROR) << "glfwInit::error";
        return -1;
    }

    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 4);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 1);

    GLFWwindow* window = glfwCreateWindow(640, 480, argv[0], NULL, NULL);
    if (!window)
    {
        LOG(ERROR) << "glfwCreateWindow::error";
        return -1;
    }

    glfwMakeContextCurrent(window);

    gladLoadGL();

    glViewport(0, 0, 640, 480);
    glClearColor(0, 0, 0, 0);

    double lastTime = glfwGetTime();
    int nbFrames = 0;

    while (!glfwWindowShouldClose(window))
    {
        // reset the view
        glClear(GL_COLOR_BUFFER_BIT);
 
        // draw something

        // display output
        glfwSwapBuffers(window);
        glfwPollEvents();

        double currentTime = glfwGetTime();
        nbFrames++;

        if ( currentTime - lastTime >= 1.0 ) {
            glfwSetWindowTitle(window, fmt::format("{:>5.0f} fps", nbFrames/(currentTime - lastTime)).c_str());
            nbFrames = 0;
            lastTime = currentTime;
        }
    }

    glfwDestroyWindow(window);
    glfwTerminate();

    LOG(INFO) << "Exiting";
    return 0;
}

I’ve added the glad library import, used glfwWindowHint to specify the version to use, and called gladLoadGL to dynamically bind all the new OpenGL features at run time.
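
One caveat: on macOS a 4.1 context is only handed out as a forward-compatible core profile, so the window hints would likely need to be expanded along these lines (a sketch I have not verified on every platform):

    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 4);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 1);
    // macOS requires a forward-compatible core profile for 3.2+ contexts
    glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);
    glfwWindowHint(GLFW_OPENGL_FORWARD_COMPAT, GLFW_TRUE);

    // ...and after glfwMakeContextCurrent, GLAD can also be pointed at glfw's loader:
    // gladLoadGLLoader((GLADloadproc)glfwGetProcAddress);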

From here the “draw something” section should be expanded to follow the OpenGL Rendering Pipeline, which consists of the following steps:

  1. Vertex Processing
    1. Vertex Specification
    2. Apply Vertex Shaders
    3. Apply Tessellation Shaders (optional)
    4. Apply Geometry Shaders (optional)
  2. Vertex Post-Processing (optional)
    1. Transform Feedback
    2. Primitive Assembly
  3. Fragment Processing
    1. Generate Fragments
    2. Apply Fragment Shaders
  4. Fragment Post-Processing (optional)
    1. Perform operations like depth testing, stenciling, and masking

Following the breakdown above, you can see there are only two required steps: Applying vertex shaders to your geometry, and applying fragment shaders to generate output.

Define Vertex and Fragment Shaders

I will be following the rather dense OpenGL tutorial on vertex and fragment shaders and making things work with glfw, glad, and C++. Before anything can be drawn by the GPU we must define both a vertex shader and a fragment shader.

For this first test I’m just going to define a very basic vertex shader that takes in 2D points, and an even simpler fragment shader that only outputs the color yellow.

    int  success;
    char infoLog[512];

    // create vertex shader
    const char *vertexShaderSource = R"shader(
        #version 330 core
        layout (location = 0) in vec2 in_pos;

        void main()
        {
            gl_Position = vec4(in_pos.x, in_pos.y, 0.0, 1.0);
        }
    )shader";

    unsigned int vertexShader;
    vertexShader = glCreateShader(GL_VERTEX_SHADER);

    glShaderSource(vertexShader, 1, &vertexShaderSource, NULL);
    glCompileShader(vertexShader);

    glGetShaderiv(vertexShader, GL_COMPILE_STATUS, &success);

    if(!success)
    {
        glGetShaderInfoLog(vertexShader, 512, NULL, infoLog);
        LOG(ERROR) << "SHADER::VERTEX::COMPILATION_FAILED\n" << infoLog;
        return -1;
    }

    // create fragment shader
    const char *fragmentShaderSource = R"shader(
        #version 330 core
        out vec4 FragColor;

        void main()
        {
            FragColor = vec4(1.0f, 1.0f, 0.0f, 1.0f);
        }
    )shader";

    unsigned int fragmentShader;
    fragmentShader = glCreateShader(GL_FRAGMENT_SHADER);

    glShaderSource(fragmentShader, 1, &fragmentShaderSource, NULL);
    glCompileShader(fragmentShader);

    glGetShaderiv(fragmentShader, GL_COMPILE_STATUS, &success);

    if(!success)
    {
        glGetShaderInfoLog(fragmentShader, 512, NULL, infoLog);
        LOG(ERROR) << "SHADER::FRAGMENT::COMPILATION_FAILED\n" << infoLog;
        return -1;
    }

Once the shaders are compiled, we need to tell OpenGL to create a “program” that runs the two shaders.

    // create shader program
    unsigned int shaderProgram;
    shaderProgram = glCreateProgram();

    glAttachShader(shaderProgram, vertexShader);
    glAttachShader(shaderProgram, fragmentShader);
    glLinkProgram(shaderProgram);

    glGetProgramiv(shaderProgram, GL_LINK_STATUS, &success);
    if(!success) {
        glGetProgramInfoLog(shaderProgram, 512, NULL, infoLog);
        LOG(ERROR) << "SHADER::PROGRAM::LINKING_FAILED\n" << infoLog;
        return -1;
    }

    glUseProgram(shaderProgram);

Vertex Specification

Modern OpenGL versions require you to define a vertex array object and then load your vertex data into a vertex buffer object.

    // define vertex data: three 2D points in normalized device coordinates
    float vertices[][2] = {
        { 0.0f,  0.5f },
        {-0.5f, -0.5f },
        { 0.5f, -0.5f }
    };

    // the vertex array object records buffer bindings and attribute layout
    GLuint vao;
    glGenVertexArrays(1, &vao);
    glBindVertexArray(vao);

    // the vertex buffer object holds the actual vertex data
    GLuint vbo;
    glGenBuffers(1, &vbo);

    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);

    // attribute 0 (in_pos): 2 floats per vertex, tightly packed, starting at offset 0
    glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, (void *)0);
    glEnableVertexAttribArray(0);

Render Loop

Once we have defined our vertex and fragment shaders, created a program to run them, and loaded some data, our render loop now looks like this:

    LOG(INFO) << "RENDERLOOP::BEGIN";

    glViewport(0, 0, 640, 480);
    glClearColor(0, 0, 0, 0);
    
    glUseProgram(shaderProgram);

    double lastTime = glfwGetTime();
    int nbFrames = 0;

    while (!glfwWindowShouldClose(window))
    {
        // reset the view
        glClear(GL_COLOR_BUFFER_BIT);
 
        // draw something
        glDrawArrays(GL_TRIANGLES, 0, 3);

        // display output
        glfwSwapBuffers(window);
        glfwPollEvents();

        double currentTime = glfwGetTime();
        nbFrames++;

        if ( currentTime - lastTime >= 1.0 ) {
            glfwSetWindowTitle(window, fmt::format("{:>5.0f} fps", nbFrames/(currentTime - lastTime)).c_str());
            nbFrames = 0;
            lastTime = currentTime;
        }
    }

    LOG(INFO) << "RENDERLOOP::END";

And, if your mother told you to clean up after yourself:

    ...

    glUseProgram(0);
    glDisableVertexAttribArray(0);
    glDetachShader(shaderProgram, vertexShader);
    glDetachShader(shaderProgram, fragmentShader);
    glDeleteProgram(shaderProgram);
    glDeleteShader(vertexShader);
    glDeleteShader(fragmentShader);
    glDeleteBuffers(1, &vbo);
    glDeleteVertexArrays(1, &vao);

    glfwDestroyWindow(window);
    glfwTerminate();

    LOG(INFO) << "Exiting";
    return 0;
}

Conclusion

That is certainly a LOT more code than the glBegin/glEnd block.

The main reason for switching to a newer version of OpenGL would be to take advantage of higher throughput operations using vertex array objects and their associated calls like glDrawArrays, as well as tapping into the full programmability of shaders.

In my next post I will expand on this basic rendering loop to display a full 3D cube, which will give us a better set of geometry to test shaders with, begin to add more features like keyboard/mouse control and better viewport management, and break the program down into functions.

OpenGL Quick Start

In my previous post, Game Engine Adventures, I talked about why I would even attempt to write my own game engine (for fun!) and how I set up my environment.

In this post I am going to start with the OpenGL equivalent of Hello World, a window with an OpenGL context that displays a simple image.

Package Requirements

I’ve added additional packages to my conanfile: glog (for nice Python-like logging in C++) and glfw (for easy window creation and input management).

conanfile.py

    requires = [
        "msys2/20210105",
        "cmake/3.20.4",
        "glog/0.5.0",
        "glfw/3.3.4",
    ]

Minimal glfw loop

The glfw documentation claims it “Gives you a window and OpenGL context with just two function calls”. This is a pretty bold claim, and while it may technically be true, the result is a completely non-functional application that creates a window and simply exits.

The real minimal glfw loop looks like this:

main.cpp

#include <glog/logging.h>
#include <GLFW/glfw3.h>

int main(int argc, const char *argv[]) {
    google::InitGoogleLogging(argv[0]);
    FLAGS_logtostderr = 1;
    LOG(INFO) << "Starting";

    if (!glfwInit())
    {
        LOG(ERROR) << "glfwInit::error";
        return -1;
    }

    GLFWwindow* window = glfwCreateWindow(640, 480, argv[0], NULL, NULL);
    if (!window)
    {
        LOG(ERROR) << "glfwCreateWindow::error";
        return -1;
    }

    glfwMakeContextCurrent(window);

    glViewport(0, 0, 640, 480);
    glClearColor(0, 0, 0, 0);

    while (!glfwWindowShouldClose(window))
    {
        // Clear the view
        glClear(GL_COLOR_BUFFER_BIT);

        // Render something
        // TODO...

        // Display output
        glfwSwapBuffers(window);
        glfwPollEvents();
    }

    glfwDestroyWindow(window);
    glfwTerminate();

    LOG(INFO) << "Exiting";
    return 0;
}

This will initialize glog to write to the console, initialize glfw and create a window (the “two line” claim?), set up the OpenGL context to render to, begin a loop waiting for the window to close, and clean up glfw before exiting.

If you run this code it should open a plain black window that can be rendered to. It’s not very sexy.

Measuring Greatness

That’s great as a sanity check, but not very interesting. Let’s add a Frames Per Second counter in the title bar. I happen to really like Python f-strings, but I’m not ready to jump into C++20 to take advantage of std::format. The good news is that std::format is based on fmtlib, so I’m just going to add that to my conanfile.py

conanfile.py

    requires = [
        "msys2/20210105",
        "cmake/3.20.4",
        "fmt/8.0.0",
        "glog/0.5.0",
        "glfw/3.3.4",
    ]

Now we can update our main render loop to measure how many frames we can render in 1 second. The code here uses glfwGetTime as a more portable clock mechanism, counts how many frames have been rendered, and if the elapsed time is greater than 1 second it updates the title bar using fmt::format. The obscure “{:>5.0f}” string is fmtlib’s syntax for padding the value with some leading spaces so the fps counter doesn’t move all over the place.

Did I pull in a whole library just to avoid sprintf or something equivalent? Yes and no. Check out the fmtlib benchmarks and you will see it outperforms many libraries, and it has great formatting options that we can take advantage of later.

    double lastTime = glfwGetTime();
    int nbFrames = 0;

    while (!glfwWindowShouldClose(window))
    {
        // Clear the view
        glClear(GL_COLOR_BUFFER_BIT);

        // Render something
        // TODO...

        // Display output
        glfwSwapBuffers(window);
        glfwPollEvents();

        // Calculate FPS
        double currentTime = glfwGetTime();
        nbFrames++;

        if ( currentTime - lastTime >= 1.0 ) {
            glfwSetWindowTitle(window, fmt::format("{:>5.0f} fps", nbFrames/(currentTime - lastTime)).c_str());
            nbFrames = 0;
            lastTime = currentTime;
        }
    }

Old School Cool

Before diving into new OpenGL features like Vertex Buffer Objects, Vertex Shaders, Mesh Shaders, and general rendering pipeline complexity, let’s just sanity check ourselves by replacing the TODO block with a very basic and old school glBegin/glEnd block using glVertex2f immediate mode calls.

        // Draw a triangle
        glBegin(GL_TRIANGLES);
            glColor3f(1.0, 0.0, 0.0);
            glVertex2f(0, 0.5);
            glColor3f(0.0, 1.0, 0.0);
            glVertex2f(-0.5, -0.5);
            glColor3f(0.0, 0.0, 1.0);
            glVertex2f(0.5, -0.5);
        glEnd();

Conclusion

This is very far from a game. It literally outputs a triangle in the least efficient way possible. In my next post I will begin to explore the capabilities of new OpenGL versions and try to benchmark some of the different rendering methods.

Next Post: Modern Rendering Pipelines

Game Engine Adventures

I don’t recommend re-inventing the wheel. So why would I develop my own game engine instead of using one of the many available ones? Because I was working on an engine 20 years ago, before all these shiny new engines were freely available, and it has always been a passion of mine. And because I am not actually planning on making a game engine, but rather I want to learn what is new and different in the world of C++ development, and figured I’d learn it within the context of implementing a game engine.

Package Management

One thing that I have always found frustrating when developing C++ applications is package management. Fortunately there are a couple of modern solutions for C++ that work like npm, pip, or other package managers. The two most popular that I saw were vcpkg and conan.

vcpkg

I briefly looked at vcpkg. It does seem to support Windows, Mac, and Linux. But it is a Microsoft solution, so I have some distaste for tying myself to it. And the number of packages it supports appears to be far smaller than conan’s, so I didn’t actually do any testing with it, purely based on personal preference.

conan.io

This looks like a promising solution. It supports multiple package repositories so you can add tens of thousands of different packages to your system, if you trust the sources. By default conan.io hosts packages on a JFrog Artifactory backend, which I currently use at work, and have a high level of confidence in. The packages I am interested in using for development seem to be available by default, without needing to add additional repositories. And, it supports a simple conanfile.txt configuration, or a more powerful conanfile.py configuration as code solution. I use python daily, so this is quite appealing to me.

A quick configuration file looks like this:

conanfile.py

from conans import ConanFile, CMake

class Dread(ConanFile):
    settings = "os", "compiler", "build_type", "arch"
    requires = [
        "msys2/20210105",
        "cmake/3.20.4",
    ]
    generators = [
        "virtualenv",
        "cmake",
        "gcc",
        "txt"
    ]
    default_options = {}

    def build(self):
        cmake = CMake(self)
        cmake.configure()
        cmake.build()

My initial requires section installs msys2 and cmake on my Windows laptop, which I normally just use for gaming since I prefer to do my daily development on a Mac. Eventually I would like to explore cross-compilation, but don’t want to slow myself down by doing too much at once.

The generators section tells conan what kind of outputs to generate using the requires (and other) sections, and the build function tells conan how to build using cmake.

With that conanfile.py in place I can simply install the dependencies and build my code like this:

cd build
conan install ..
conan build ..

I think the .. path style is an odd choice on the part of conan, but it is cleaner looking than the alternative “conan install . --install-folder build”.

Build Tools

Before I selected CMake I was looking at a few options. The top choices in my mind were: CMake, SCons, and Bazel.

SCons

This was originally brought to my attention by a couple of projects that I was familiar with, and I thought about giving it a try. After spending some time familiarizing myself with it, I became more and more aware of the lack of support behind it. First of all, it was more popular a few years ago and has been in a steady decline since 2017. The second factor against it was that, as I looked at other tools, their general stance on SCons was “nobody uses it so we haven’t really tested this”, and comments from other blogs said “if you are supporting legacy systems then use it, if you are starting a new project then don’t use it”. So I moved on to checking out other tools.

Bazel

I thought it might be nice to check out Bazel because, well, Google. But, again, looking at trends, it appears there has been a HUGE drop off in Bazel popularity. I don’t know if this is purely pandemic related, or if there are other factors at play, but it looks like its popularity has died off in the last year. Combine that with the fact that it is written in Java (and I’m not really interested in adding Java requirements to this project), and I moved on from Bazel.

CMake

I made a bit of a long, circuitous loop to get back to CMake. But it really came down to using something with minimal overhead and wider support, while introducing the least extra tooling and complexity. CMake is at least somewhat similar to traditional Makefiles (more so than SCons or Bazel, because it generates Makefiles), and getting started using it with conan is well supported and very straightforward.

CMakeLists.txt

cmake_minimum_required(VERSION 3.20.4)
project(Dread)

include(${CMAKE_BINARY_DIR}/conanbuildinfo.cmake)
conan_basic_setup(NO_OUTPUT_DIRS)

link_libraries(
    ${CONAN_LIBS}
)

add_executable(Dread src/main.cpp)

The conan cmake generator creates a conanbuildinfo.cmake file that contains ALL your header and library definitions for the packages you’ve installed with conan. This makes it very easy to include them in your CMakeLists.txt file and quickly have a working build.

Conclusion

With package management and a build system in place I was able to easily compile my main.cpp source file and run a simple SDL demo to prove my development environment worked.

In my next post I will cover how I have set up my IDE, project directory structure, and the initial entry point for the engine.

Next Post: OpenGL Quick Start

Workaround for gmail error “downloading this attachment is disabled”

I wanted to download an attachment that was sent to me years ago, but gmail now blocks downloading attachments it doesn’t like and gives the user no way to get them. Their helpful advice was to ask the original sender to upload it to Google Drive. Not helpful when the original sender no longer has the file because it is a decade old.

Workaround

Gmail has your data right there; it just thinks it knows better than you. Maybe this helps protect some people. But I just want my file.

The first thing to do is view the original source by clicking on the three dots at the top right of the email.

Then click the Download Original link and open the file in a text editor. At the top of the file you will find the message header and body, and just after that you will find the attachments encoded as text using base64. Each attachment will appear as a block like this:

--bcaec554d754b0f76a04d9fda578--
--bcaec554d754b0f77204d9fda57a
Content-Type: application/pdf; name="test.zip"
Content-Disposition: attachment; filename="test.zip"
Content-Transfer-Encoding: base64
X-Attachment-Id: 123456789_0.1

d2h5IGFyZSB5b3UgdHJ5aW5nIHRvIGRlY29kZSB0aGlzPyB3aHkgYXJlIHlvdSB0cnlpbmcgdG8gZGVjb2RlIHRoaXM/d2h5IGFyZSB5b3UgdHJ5aW5nIHRvIGRlY29kZSB0aGlzP3doeSBhcmUgeW91IHRyeWluZyB0byBkZWNvZGUgdGhpcz93aHkgYXJlIHlvdSB0cnlpbmcgdG8gZGVjb2RlIHRoaXM/d2h5IGFyZSB5b3UgdHJ5aW5nIHRvIGRlY29kZSB0aGlzP3doeSBhcmUgeW91IHRyeWluZyB0byBkZWNvZGUgdGhpcz93aHkgYXJlIHlvdSB0cnlpbmcgdG8gZGVjb2RlIHRoaXM/d2h5IGFyZSB5b3UgdHJ5aW5nIHRvIGRlY29kZSB0aGlzP3doeSBhcmUgeW91IHRyeWluZyB0byBkZWNvZGUgdGhpcz93aHkgYXJlIHlvdSB0cnlpbmcgdG8gZGVjb2RlIHRoaXM/d2h5IGFyZSB5b3UgdHJ5aW5nIHRvIGRlY29kZSB0aGlzP3doeSBhcmUgeW91IHRyeWluZyB0byBkZWNvZGUgdGhpcz93aHkgYXJlIHlvdSB0cnlpbmcgdG8gZGVjb2RlIHRoaXM/d2h5IGFyZSB5b3UgdHJ5aW5nIHRvIGRlY29kZSB0aGlzP3doeSBhcmUgeW91IHRyeWluZyB0byBkZWNvZGUgdGhpcz93aHkgYXJlIHlvdSB0cnlpbmcgdG8gZGVjb2RlIHRoaXM/d2h5IGFyZSB5b3UgdHJ5aW5nIHRvIGRlY29kZSB0aGlzP3doeSBhcmUgeW91IHRyeWluZyB0byBkZWNvZGUgdGhpcz8K

--bcaec554d754b0f76a04d9fda578--

Either copy and paste the content between the X-Attachment-Id header and the closing boundary into a new file, or, for very large attachments, delete everything except the encoded attachment.

On a Mac/Unix environment you can use the base64 program to decode the attachment:

cat encoded.txt | base64 --decode > test.zip
unzip test.zip

MySQL Character Encoding

I ran into some issues with a project where I was developing on one computer and pushing changes to a test server, while the production data lived on yet another server, across a mix of platforms and software versions. When I would do a backup from production to dev, the character encoding was coming out wrong.

What I discovered was that running mysqldump and redirecting the output to a file can result in the terminal’s character encoding reinterpreting the output, and that the dump file from one version/platform of MySQL was not creating the new database with the same character encoding.

My fix was to do the following:

mysqldump -u username -p -c -e --default-character-set=utf8 --single-transaction --skip-set-charset --add-drop-database -B database -r dump.sql

Then run the following:

sed -e 's/DEFAULT CHARACTER SET latin1/DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci/g; s/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/g' -i.bak dumpfile.sql

After the export and conversion you can run:

mysql -u username -p --default-character-set=utf8 database
mysql> SOURCE dumpfile.sql


Migrating Data to a Synology DS1812+

Migrating data from one storage platform to another can be slow and tedious, if you just plug it into the network and hope for the best.

For example, my customer’s network runs at 10/100 Mbps. So, when we plugged in the new DS1812+ to the network we achieved an underwhelming 7MB per second on average. This would have taken over 4 days to transfer 4TB.

To see what is going on, first look at the network speed. A 100Mbps network has about 10% overhead just to send packet headers. After converting BITS to BYTES, you can achieve about 11MB per second. The files we are transferring vary in size, and every start and stop of a file transfer creates a pause in data being transferred, which reduces the resulting transfer rates.
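
As a quick back-of-the-envelope check of those numbers:

100 Mbps x 0.90 (after ~10% packet header overhead) = 90 Mbps usable
90 Mbps / 8 bits per byte ≈ 11 MB per second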

Additionally, the tool you use to transfer files has overhead. In the default case of using rsync, it uses SSH to transfer files, which means every file has to be encrypted and decrypted. This also reduces the transfer rate because now the processor on both ends needs to do additional work.

To speed things up, we performed a few tricks. First we directly connected a secondary NIC on the server to the secondary NIC on the DS1812+, which allowed both devices to connect at 1Gbps, which translates to a theoretical max of 100+MB per second. Then we configured both the server and the DS1812+ to use a Maximum Transmission Unit (MTU) of 9000 bytes. Then we told rsync to use the rsync daemon (rsyncd) instead of SSH so we could avoid the encryption overhead. Since these devices are directly connected in our datacenter there is no real risk of data exposure.

Here is how to set the MTU on an OSX server:

sudo networksetup -setMTU en1 9000

The resulting transfer rates now peak at 50+MB per second. Again, there is overhead for rsync to start transferring each file, and if the file is less than 50MB the whole file will transfer in less than a second, so you will see a lot of transfers at 10MB per second, or 30MB per second, etc., depending on file size.

Here is the command we used:

rsync --recursive --copy-links --times --verbose --progress /Volumes/source rsync://user@192.168.1.1/target


linux dependency hell

One of the old servers I discovered in a forgotten office was running Debian 4. We wanted to do a physical to virtual (P2V) migration so it was no longer running on the old hardware, which was about 8-10 years old. Unfortunately, this old box was not running SSH, and, as seems to happen with “things that have been forgotten”, nothing “just works”.

In order to run VMWare Converter you need to have ssh access. But, sshd was not running on the box, and it appeared the binaries were missing.

I tried to run aptitude install openssh-server and found there was a dependency problem where libc6-dev had been updated to 2.7-18lenny7, but libc6 was still using 2.7-18lenny4. All attempts to update libc6 were met with errors finding programs like locale, or ldconfig, or /etc/init.d/glibc.sh. The /etc/apt/sources.list was so old the mirror no longer existed, so I looked up Debian’s archives and changed it to http://archive.debian.org/debian-archive/debian and did an aptitude clean and aptitude update.

At this point I could actually download packages again, but upgrading still failed. After clearing aptitude’s cache and trying again, it still failed. So I ran aptitude download libc6, and then ran dpkg-deb -x libc6*.deb libc6-unpacked.

I then copied the ldconfig and glibc.sh programs from the extracted folder and put them back on the system where they were supposed to be. Then I ran dpkg -i libc6_2.7-18lenny8_amd64.deb, which successfully installed and allowed me to run aptitude upgrade to bring the whole box up to date and run aptitude install openssh-server.

Great, back to VMWare Converter. Enter the IP, name, and password… and error: Unable to query live Linux source. I tested out connecting to the box with an ssh client and was greeted with “Permission denied” as soon as I connected. Looking at the sshd_config revealed it had no “PasswordAuthentication yes” line, so I added that and did service sshd restart. Now the VMWare Converter could connect and the migration started running.

The next problem was that the import failed. Watching the box start up, it could not find the root partition on /dev/hda1. VMWare 5.0 uses LSI Logic SATA drives, so it was clear the old kernel was compiled without the correct drivers. Back to the old box: download the linux source, extract it, make menuconfig. I went with most of the defaults but added executable emulation for 32bit binaries on an amd64 core. Did a make, make modules_install, and make install. The old box was using lilo, but someone had tried to install grub, so I finished the config file and had it point to the old kernel with a boot option for the new kernel. Ran grub-install, rebooted, then ran the converter again.

The new kernel didn’t have the right NIC drivers, so I let it boot into the old kernel. It failed at the same point during the conversion, but this time I just booted it myself and selected my new kernel and both the LSI Logic and VMXNET3 network cards worked, and the services all started up.


VMware 2.0 to 5.0 Migration

The things you find in old closets. Sometimes they might be better left in the closet, hidden from view, but when it is an old server and I’m trying to secure your network, it has to be dragged into the light and exorcised.

One of my favorite discoveries has been an old 2008 server (I was worried it was going to be Windows NT!) that was running VMWare Server 2.0. Now, I’ve been doing IT for 20 years, but I had never actually seen VMWare Server 2.0 before. So this was quite an exciting discovery. I felt like an archaeologist unearthing an ancient Roman artifact.

After the initial laughter and sending screenshots to everyone I know, I decided to migrate the one VM (a Debian 4 distro) that was running on the server to the production environment so it could be backed up and decommissioned properly. But the big question was, would I be able to successfully migrate it from VMWare 2.0 to VMWare 5.0?

Since you can’t convert a VM that is running, and nobody had the password for the old VM, I just powered it off. Then I loaded up VMWare Converter, told it to convert an “other” image type, pointed it at the \\old-vm-server\e$ share, and browsed to the vmdk file. It took an hour to migrate it and convert it to an ESX 5.0 host with hardware level 8. I went ahead and added a VMXNET3 network card instead of the old VMWare 2.0 “Flexible” network card. Then I powered the guest on and reset the root password (edit the startup command and add init=/bin/bash, then run mount -rw -o remount /, change the root password, and reboot). Once I logged in with my new root password I modified /etc/network/interfaces to use the new network card and restarted the server again just to make sure everything worked. And it did!

Needless to say, I am very impressed that VMWare has made it so easy to migrate from a 2.0 guest to their latest 5.0 environment. So often big companies leave no migration path. This just shows that VMWare is a good company with a great product!

OSX Firewalls – a dismal experience

I’m spoiled by unix firewalls’ extreme flexibility and, paradoxically, by the Windows firewall’s ease of configuration.

There should be a good middle ground in there. Mac does a great job of “being” unix, but with a much easier interface than Windows. Which is a feat. But, let me just put on my rant hat and rant pants. WHAT THE HELL IS WRONG WITH THE OSX FIREWALL!?!?

Why would you move from ipfw to the more featureful PF firewall that the unix environment offers, and then only provide a brain dead interface that allows you to select Applications to allow through the firewall, with ZERO ability to limit the networks or IPs that are allowed to use those applications?

What kind of security is provided by either allowing a) the entire world to access Screen Sharing, or b) nobody…

Yes, you can make an argument that the corporate firewall, or even your home router, should be acting as hardware firewall to protect you. But when I go to Starbucks, who is protecting me there? When I’m in the airport, who is protecting me? Nobody is. Thanks Apple.

Microsoft gets it right in this department. And, as far as I am concerned, Apple doesn’t even offer a usable firewall. At least not out of the box.

Here is my solution: PFLists by Hany El Imam

This handy little app allows you to specify which networks or IP addresses are allowed to connect to which ports on your computer.

The only thing missing is Microsoft’s concept of “network location” so I can be more open at home and more secure at Starbucks.


Bulk Password Testing

A client has a ton of unix hosts; they all have different passwords, they are not well documented, and we need to secure them. Rather than rooting all of them or manually typing in a list of possible passwords and accounts on each one, you can use ncrack in an automated way to scan a network and test username and password combinations.

Install ncrack

apt-get install build-essential checkinstall libssl-dev  libssh-dev
wget http://nmap.org/ncrack/dist/ncrack-0.4ALPHA.tar.gz
tar xvfz ncrack-0.4ALPHA.tar.gz
cd ncrack-0.4ALPHA/

./configure
make
sudo checkinstall
sudo dpkg -i ncrack_0.4ALPHA-1_amd64.deb

Create a password list

For my purposes we had a list of passwords we could try. If you don’t have enough information to create a reasonable password list, you can grab a list of 500 passwords from skullsecurity.org.

wget http://downloads.skullsecurity.org/passwords/500-worst-passwords.txt

Run ncrack

Note that you can specify multiple user accounts to try as a comma-separated list.

(Oh, and this is just sample output and not from one of our servers.)

ncrack -p 22 --user root -P 500-worst-passwords.txt 192.168.1.0/24

## sample output ##

Starting Ncrack 0.4ALPHA ( http://ncrack.org ) at 2011-05-05 16:50 EST
Stats: 0:00:18 elapsed; 0 services completed (1 total)
Rate: 0.09; Found: 0; About 6.80% done; ETC: 16:54 (0:04:07 remaining)
Stats: 0:01:46 elapsed; 0 services completed (1 total)
Rate: 3.77; Found: 0; About 78.40% done; ETC: 16:52 (0:00:29 remaining)

Discovered credentials for ssh on 192.168.1.10 22/tcp:
192.168.1.10 22/tcp ssh: 'root' 'toor'

Ncrack done: 1 service scanned in 138.03 seconds.

Ncrack finished.