15 januari 2012

Measuring graphics performance

If you want to measure the render time, it doesn't work very well with a standard OS timer function. The reason for this is that OpenGL will do some of the work in the background, which means your timer function can return a value close to zero. There is support in OpenGL 3.3 to request the actual render time, using queries. This is done in a couple of steps:
  1. Request OpenGL to begin the query.
  2. Do the Draw operation
  3. Request OpenGL to stop the query.
  4. Read out the result of the query.
The problem is that you obviously can't read the result until the drawing is done. And as already mentioned, the actual drawing may be done in the background and still not be complete when you ask for the result. That means that OpenGL will have to wait until the drawing is actually done, and then return the result. This can severely degrade the performance. It could be okay if you only do this during development, but it will screw up the timing of other functions, and be less helpful.

The result of a query is available until you start the next query on the same query ID. As long as the result isn't requested too early, the pipeline will not be disturbed. The trick I am using is to read out the result the frame after, instead of in the current frame. The draw back is that the result will be one frame old, which is not a problem for doing statistics. That is why, in the pseudo code below, I read out the result first, and then request a new query to be set up.

GLuint queries[3]; // The unique query id
GLuint queryResults[3]; // Save the time, in nanoseconds

void Init() {
    glGenQueries(3, queries);
}

The main loop is as follows:

bool firstFrame = true;
while(1) {
    if (!firstFrame)
        glGetQueryObjectuiv(queries[0], GL_QUERY_RESULT, &queryResults[0]);
    glBeginQuery(GL_TIME_ELAPSED, queries[0]);
    DrawTerrain();
    glEndQuery(GL_TIME_ELAPSED);

    if (!firstFrame)
        glGetQueryObjectuiv(queries[1], GL_QUERY_RESULT, &queryResults[1]);
    glBeginQuery(GL_TIME_ELAPSED, queries[1]);
    DrawTransparents();
    glEndQuery(GL_TIME_ELAPSED);

    if (!firstFrame)
        glGetQueryObjectuiv(queries[2], GL_QUERY_RESULT, &queryResults[2]);
    glBeginQuery(GL_TIME_ELAPSED, queries[2]);
    DrawMonsters();
    glEndQuery(GL_TIME_ELAPSED);

    printf("Terrain: %.2f ms, Transparent: %.2f ms, Monsters: %2.f ms\n",
        queryResult[0]*0.000001, queryResult[1]*0.000001, queryResult[2]*0.000001);
    SwapBuffers();
    firstFrame = false;
}

C++ Implementation

For a C++ class automating the measurement, see Ephenation TimeMeasure header file and implementation file.

AMD

If you have a graphics card from AMD, there is a tool available that will give very detailed timing reports: GPUPerfAPI.

Revision history

2012-06-13 Added reference to AMD tool.
2013-02-07 Added reference to class implementation.