Using gperf with C++

16. June 2012 00:02

 

Did you know that you can use gperf with C++ to generate hash tables that make code easier to read and faster to run? This is a short guide on how to do that for a fairly common problem.

 

This common problem comes down to code clarity and performance. If you have ever seen large functions that span thousands of lines of if or case statements, you already know what the problem is. It typically appears in client / server software, where the server has to work out which function to run for an incoming command from the client. Quite often this ends up as code something like this.

 

if (command == "ONE") { DoOne(); }
if (command == "TWO") { DoTwo(); }
if (command == "THREE") { DoThree(); }

 

 

In a large client / server project the above can grow to thousands of commands, and new commands are usually added to the end of the list. The obvious problem is that for the 1000th command the server has to evaluate 1000 if statements, which isn't a great solution.

 

So here is a better way to do it. We can start with some example code that might exist on the server, i.e. the functions that are run as the incoming commands are processed. This is of course a simple example.

 

 

class Test {
public:
    static void One() { printf("One\n"); }
    static void Two() { printf("Two\n"); }
    static void Three() { printf("Three\n"); }
};

 

We also need a way to associate each command name with the function above that it should call. We can do that by creating a structure with a name (the command) and a pointer to the function.

 


struct TType {
    const char *name;
    void (*func) (void);
};
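
To make the role of the structure concrete, here is a minimal sketch of how TType entries could be stored and searched by hand with a plain linear scan. The names kCommands and FindCommand are hypothetical and only for illustration; this is the part that gperf replaces with a generated hash lookup.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

struct TType {
    const char *name;
    void (*func) (void);
};

class Test {
public:
    static void One() { std::printf("One\n"); }
    static void Two() { std::printf("Two\n"); }
};

// Hypothetical hand-written table mapping command names to functions.
static const TType kCommands[] = {
    {"ONE", Test::One},
    {"TWO", Test::Two},
};

// Hypothetical helper: linear search over the table (case sensitive).
// Returns NULL when the command is unknown.
static const TType *FindCommand(const char *name) {
    const std::size_t count = sizeof(kCommands) / sizeof(kCommands[0]);
    for (std::size_t i = 0; i < count; ++i) {
        if (std::strcmp(kCommands[i].name, name) == 0) {
            return &kCommands[i];
        }
    }
    return NULL;
}
```

A caller would check the result of FindCommand("ONE") for NULL and then call its func member. The scan still costs one string comparison per entry, which is why the next step moves the search into gperf.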

 

The next idea is to create a function that can look up the TType structure in a data set by the incoming command name. This is where gperf comes in: we create a gperf file which will end up looking something like the following. There is a working example of a gperf file at the end of this post. It is also worth pointing out that in the data list between the two %% lines you are not permitted to use spaces for layout, as these will be treated as empty strings. After all, they are valid data. However, you can add spacing and comments by prefixing a line with #.

 

 

%ignore-case
%language=C++
%define lookup-function-name Lookup
%define class-name Functions
struct TType
%%
####
ONE,    Test::One
TWO,    Test::Two
THREE,  Test::Three
####
%%

 

The above tells gperf to ignore case when matching strings and to produce C++ output. It also makes gperf create a C++ class with a static lookup function, named Functions::Lookup, in the generated output file, along with a static array of TType that contains the entire data list. As an example, the C++ code that is generated is shown below. gperf also generates a number of other items related to the hash calculation that is performed during the lookup. I have only included a small chunk of the file, as the rest of it really isn't human readable.

 

 

static const struct TType wordlist[] =
  {
    {""}, {""}, {""},
#line 31 "gperf-exmaple.gperf"
    {"TWO",    Test::Two},
#line 30 "gperf-exmaple.gperf"
    {"ONE",    Test::One},
#line 32 "gperf-exmaple.gperf"
    {"THREE",  Test::Three}
  };

const struct TType *
Functions::Lookup (register const char *str, register unsigned int len)
{
  if (len <= MAX_WORD_LENGTH && len >= MIN_WORD_LENGTH)
    {
      register int key = hash (str, len);

      if (key <= MAX_HASH_VALUE && key >= 0)
        {
          register const char *s = wordlist[key].name;

          if ((((unsigned char)*str ^ (unsigned char)*s) & ~32) == 0 && !gperf_case_strcmp (str, s))
            return &wordlist[key];
        }
    }
  return 0;
}
#line 34 "gperf-exmaple.gperf"
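
The part of the generated file that is left out above is the hash function. As a rough illustration of the idea only, here is a hand-written sketch of a perfect hash for the same three keywords. It is not gperf's actual output: the names SketchEntry, SketchHash and SketchLookup, the id field and the per-character values are all assumptions chosen so that each keyword lands in its own slot, and the input is assumed to be null terminated.

```cpp
#include <cctype>
#include <cstddef>

// Hypothetical entry type for this sketch; id stands in for the function pointer.
struct SketchEntry {
    const char *name;
    int id;
};

// Slots 0..2 are unused, mirroring the empty entries in the generated wordlist.
static const SketchEntry kSketchTable[] = {
    {"", 0}, {"", 0}, {"", 0},
    {"TWO", 2}, {"ONE", 1}, {"THREE", 3}
};
static const std::size_t kMaxHash = 5;

// Hash = length + a value picked per first character (assumed values).
// TWO -> 3, ONE -> 4, THREE -> 5, so no two keywords collide.
static std::size_t SketchHash(const char *str, std::size_t len) {
    switch (std::toupper(static_cast<unsigned char>(str[0]))) {
        case 'T': return len + 0;
        case 'O': return len + 1;
        default:  return kMaxHash + 1;  // cannot be a keyword
    }
}

// Case-insensitive comparison of two null-terminated strings.
static bool EqualIgnoreCase(const char *a, const char *b) {
    for (; *a != '\0' && *b != '\0'; ++a, ++b) {
        if (std::toupper(static_cast<unsigned char>(*a)) !=
            std::toupper(static_cast<unsigned char>(*b))) {
            return false;
        }
    }
    return *a == *b;
}

// One hash computation plus one string comparison, however many keywords exist.
static const SketchEntry *SketchLookup(const char *str, std::size_t len) {
    if (len < 3 || len > 5) return NULL;
    const std::size_t key = SketchHash(str, len);
    if (key > kMaxHash) return NULL;
    const SketchEntry *entry = &kSketchTable[key];
    return EqualIgnoreCase(str, entry->name) ? entry : NULL;
}
```

The final string comparison is still needed because inputs that are not keywords can hash to an occupied slot. For example, "Tww" hashes to the same slot as "TWO" and is only rejected by the comparison.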

 

The next step is to make the program call the lookup function. This is normally done by generating the above file from the gperf file with a command like gperf -tCG gperf-example.gperf > gperf-example.h and then including that file in the C++ code that calls the lookup function. As an example, you end up with a program that looks like this.

 

 

#include "gperf-example.h"


int main(int argc, char **argv) {
    const TType *tmp = Functions::Lookup("One", 3);

    if (tmp == NULL) {
        printf("FAILED\n");
    } else {
        tmp->func();
    }
}

 

The above is obviously a little easier to maintain in the long run, and it runs a lot faster than working through hundreds of if statements. It can also be integrated with the build system so that the file is regenerated automatically when updates are made.

 

Here is a complete runnable example of a gperf file. I put this together to show how the C++ code can be mixed into the gperf file.

 

 

%{

/* gperf -tCG gperf-example.gperf > gperf-example.cpp */

#include <stdio.h>
#include <string.h>

struct TType {
    const char *name;
    void (*func) (void);
};


class Test {
public:
    static void One() { printf("One\n"); }
    static void Two() { printf("Two\n"); }
    static void Three() { printf("Three\n"); }
};

%}

%ignore-case
%language=C++
%define lookup-function-name Lookup
%define class-name Functions
struct TType
%%
####
ONE,    Test::One
TWO,    Test::Two
THREE,  Test::Three
####
%%


int main(int argc, char **argv) {
    const TType *tmp = Functions::Lookup("One", 3);

    if (tmp == NULL) {
        printf("FAILED\n");
    } else {
        tmp->func();
    }
}

 

The above can be processed, compiled and run with the following commands.

 

 

gperf -tCG gperf-example.gperf > gperf-example.cpp
g++ -Wall gperf-example.cpp -o gperf-example
./gperf-example


Using gdb to debug a core file

5. April 2012 08:00

 

The GNU debugger (gdb) is probably the best tool for looking into core files. It also isn't overly complex to use to get some basic starting information. So this is a quick guide to getting some debug information, e.g. variables and stack traces, from a core dump, which is produced when an application crashes on Linux.

 

If an application crashes and doesn't produce a core file, it is probably because of the limit settings. You can check the limit with "ulimit -c"; if it outputs 0, no core file will be produced. You can use "ulimit -c unlimited" to make the core dump file size unlimited. Be aware though that if you have a lot of crashes this can use a significant amount of disk space.
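
If changing the shell limit is not convenient, a program can also raise its own core limit at startup. The following is a minimal sketch using the POSIX getrlimit and setrlimit calls with RLIMIT_CORE. The helper name EnableCoreDumps is hypothetical, and the sketch assumes a Linux or other POSIX system.

```cpp
#include <sys/resource.h>

// Hypothetical helper: raise the soft core file size limit to the hard limit,
// which is as far as an unprivileged process is allowed to go.
// Returns true on success.
static bool EnableCoreDumps() {
    struct rlimit limit;
    if (getrlimit(RLIMIT_CORE, &limit) != 0) {
        return false;
    }
    limit.rlim_cur = limit.rlim_max;
    return setrlimit(RLIMIT_CORE, &limit) == 0;
}
```

This only affects the calling process and any children it starts afterwards. If the hard limit is itself 0, no core file will be written.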

 

Let's start with finding out what made a core file in the first place. In order to debug it at all you need to know exactly which program crashed. You can determine this using the following command.

 

cat core |strings |grep -E '^_='
_=./willcore.exe

 

In this case "./willcore.exe" made the core dump. Another way to find out what dumped core is to use gdb. An example is below.

 

gdb --core core

Core was generated by `./willcore.exe'.
Program terminated with signal 11, Segmentation fault.
#0  0x080483d4 in ?? ()
(gdb)

 

Something to notice at this point is that it shows where the last execution point was. In this case it was running at memory address 0x080483d4. However, since no debugging symbols are loaded in gdb yet, it shows ?? because it cannot translate the raw address to a function.

 

You can get gdb to load the executable and debugging symbols (assuming they are compiled into the executable) using the "file" command. I have also added the "bt" command to produce a backtrace that shows the execution stack.

 

 

(gdb) file ./willcore.exe
Reading symbols from /home/james/CVS-Root/linux/misc/willcore.exe...done.
(gdb) bt
#0  0x080483d4 in main (argc=1, argv=0xbf877ef4) at willcore.c:8

 

In the example above the symbols now resolve and show a lot more information. If they do not show after loading the executable into gdb, there is a problem with the debugging symbols. You will need to compile the program with debugging switched on, e.g. with the "-ggdb" flag in gcc and g++.

 

Now that things are loaded you can move between stack frames using "frame <number>", where number is the part beside the # in the stack output. You can also print and inspect other parts of memory, and list the source code of the program, assuming the source still exists in the location it was originally compiled from.

 

 

(gdb) p argc
$1 = 1
(gdb) p argv[0]
$2 = 0xbf87988c "./willcore.exe"
(gdb) p argv[1]
$3 = 0x0
(gdb) list
1
2
3       #include <stdio.h>
4
5       int main(int argc, char **argv) {
6               char *tmp = 0;
7
8               *tmp = '0';
9
10
(gdb) p tmp
$4 = 0x0

 

Note: the above example shows how to produce a core file by dereferencing a null pointer in C, which in itself can be useful to force a core dump, and is what I used for this tutorial.
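
To try the same crash without taking down the program you are working in, one option is to run it in a child process and have the parent report which signal ended it. The following is a minimal sketch under that assumption. The helper name CrashChildAndGetSignal is hypothetical, and the sketch relies on the usual Linux behaviour that writing through a null pointer raises a segmentation fault.

```cpp
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork a child that writes through a null pointer, as in
// willcore.c, and return the signal that terminated it.
// Returns 0 if the child exited normally, or -1 on error.
static int CrashChildAndGetSignal() {
    const pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        // volatile stops the compiler from reasoning about the null value.
        char *volatile tmp = 0;
        *tmp = '0';
        _exit(0);  // only reached if the write did not crash
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On a typical Linux system the result should be SIGSEGV (signal 11), matching the "Program terminated with signal 11" line in the gdb output earlier. Whether the child also leaves a core file behind depends on the core limit that is in effect.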

 

As a final example, the following will dump the stacks of all threads that were running in a process at the time it crashed. In the sample program there is only one thread, but in a large application this could produce several pages of output.

 

 

(gdb) thread apply all bt

Thread 1 (Thread 12519):
#0  0x080483d4 in main (argc=1, argv=0xbf877ef4) at willcore.c:8
(gdb)

 
