Object Oriented C: Single inheritance Part II

Here’s another installment on Object Oriented C programming. This time, we’re going to take a look at how inheritance allows you to simulate a generic function courtesy of the C preprocessor. If you have ever wondered how those spiffy objects in other programming languages always get “printed,” here’s one way to do it.

First, let’s write a little code:

#include <stdio.h>

typedef struct _object Object;

#ifndef OOCP_PRINTFUNC
typedef int  (*PrintFunc) (void * stream,
                           const char * format,
                           ...);
#define OOCP_PRINTFUNC
#endif

#ifndef OOCP_PRINTER
typedef void (*Printer)   (Object * o,
                           PrintFunc print_func,
                           void * stream);
#define OOCP_PRINTER
#endif

#define object_print(o,p,s)   (((Object*)(o))->print ((Object*)(o),(PrintFunc)(p),(s)))

struct _object {
    Printer print;
};

typedef struct _thing {
    Object o;
    int value;
} Thing;

void
thing_printer(Object * o, PrintFunc print, void * stream) {

    Thing * t = (Thing *)o;
    print(stream, "Value from thing_printer: %d\n", t->value);
}

int
main(int argc, char ** argv) {

    Thing t;
    t.o.print = thing_printer;
    t.value = 1;

    object_print(&t, fprintf, stdout);

    return 0;
}


Ok, some 50 lines of code. Doesn’t look like much, but if you’re new to C, and haven’t been following along, rest assured: we’re starting to flex some muscle now.

That is, there’s a lot going on in this code.

Let’s start at the top:

typedef struct _object Object;

#ifndef OOCP_PRINTFUNC
typedef int  (*PrintFunc) (void * stream,
                           const char * format,
                           ...);
#define OOCP_PRINTFUNC
#endif

• Line 3 (the typedef of Object) is familiar, but we would normally expect this in a header file rather than a source file. We’re keeping the typedef here to keep the listing simple.
• But what’s this typedef on Line 6 (PrintFunc)? And why the ugly #ifndef, #define and #endif on lines 5, 9 and 10?

Easy stuff first. The #ifndef, #define and #endif lines form a guard that protects your code from a compiler redefinition error. The idea is you want to define something in one place only, and have that thing (whatever it is) defined the same way throughout your program. Remember, we’re dealing with strong typing, and a compiled language. Interpreted languages, Ruby or PHP for example, place few limits on how your data may be defined.

In those languages, if you want to declare something a function in one line of code, then that same thing as an array in another line of code, that’s your business.

But not with C.

Line 6 gets into the meat. Recall we typedef’ed structs in header files to create hidden implementations. The type checker just needs the declaration to compile properly; the implementation can be anywhere within the typedef’s scope. Here, we’re using typedef to create a pointer-to-function type, which is called PrintFunc.

If you think the prototype looks familiar, perhaps similar to functions declared in stdio.h, you’re thinking correctly.

Let’s continue…

#ifndef OOCP_PRINTER
typedef void (*Printer)   (Object * o,
                           PrintFunc print_func,
                           void * stream);
#define OOCP_PRINTER
#endif

#define object_print(o,p,s)   (((Object*)(o))->print ((Object*)(o),(PrintFunc)(p),(s)))

struct _object {
    Printer print;
};


With Line 13, we’re getting a bit more esoteric. Now we’re building our typedef’ed function from our own derived types instead of C’s primitives, or from types defined in the standard library. We’ve seen this stuff before, though, even if just a few lines before in the case of PrintFunc.

Now, any function which prints and takes the same arguments as Printer can be assigned to a Printer variable. More importantly, such functions can be assigned to member variables in any struct declaring a Printer member.

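As for object_print on Line 19: it’s a macro, not a function. It casts its first argument to Object * and then calls that object’s print member.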

Recall in Java you have Object.toString()? This is pretty close to that.

We now have all the tools we need, so let’s put it together.

typedef struct _thing {
    Object o;
    int value;
} Thing;

void
thing_printer(Object * o, PrintFunc print, void * stream) {

    Thing * t = (Thing *)o;
    print(stream, "Value from thing_printer: %d\n", t->value);
}


We’ve got our Thing (Line 25), and a way to print (Line 31) our Thing. Life is good.

Now let’s use our Thing:

int
main(int argc, char ** argv) {

    Thing t;
    t.o.print = thing_printer;
    t.value = 1;

    object_print(&t, fprintf, stdout);

    return 0;
}


And there you have it, on Line 44: print that Thing out using its very own print “method.”

Exercise: Instead of declaring a Thing t, declare a Thing * t. Allocate its memory, print it, then remember to deallocate the memory afterwards. Run the program through valgrind to check your memory usage. The heap summary should show memory allocated during the run, but nothing left over at exit (no leaks).

The usual caveats…

You won’t want to use object_print for everything, but you might find that a printing function defined in the same way works in a deeper class hierarchy. Suppose you have a Matrix class, which is the child of Object, and has children representing different kinds of matrices. In this case, having a matrix_print might be just the ticket.

Remember: object oriented anything is not a silver bullet for slaying design dragons. The simple framework we’re developing here is just that, simple. Instead of trying to bend your problem into this framework, use these concepts to create a framework that works best for your project.

Here’s some related material on function pointers:

• The Function Pointer Tutorials: An excellent followup by Lars Haendel going into considerably more detail than what’s written here.
• Declaring, Assigning, and Using Function Pointers: This is a classic piece of exposition on C, written by Steve Summit. I recall Steve from way back in Usenet days, before the World Wide Web. He’s the real deal and it’s worth studying this article in detail.

Stay tuned for an article specifically on how to typedef function pointers.

Single Inheritance In C Programming Language I: Nesting structures

The C programming language lends itself well to single inheritance object-oriented schemes, provided a little care is taken in the class structure.

Consider the following definition:

typedef struct _object {
    int foo;
} Object;

typedef struct _thing {
    Object o;
    double bar;
} Thing;


Right here, I know the double declaration has lost half of you, and lost all the systems programmers… but bear with me. My background is in scientific computation; doubles are our bread and butter.

What is this “Object” at the head of “_thing”?

Well, Object is the parent of Thing. Object blocks out a contiguous chunk of memory at the head of the _thing struct. This lets you do the following:

1. Cast to Object and access the members of Object.
2. Pass any child of Object through a function definition as an Object type.

What’s it look like?

Casting to parent in function call

Let’s write a little program, using the two structures above:

void
print_foo(Object * o) {

    printf("o->foo: %d\n", o->foo);
}

int
main (int argc, char ** argv) {

    Thing t;
    t.o.foo = 1;
    printf("t.o.foo: %d\n", t.o.foo);
    print_foo(&t.o);
    print_foo((Object *)&t);
    return 0;
}


Assuming this little program is called “simple_inheritance.c,” compile with

gcc -DHAVE_CONFIG_H -I. -I..  -I../include -Wall -pedantic   -g -O2 -MT simple_inheritance.o -MD -MP -MF .deps/simple_inheritance.Tpo -c -o simple_inheritance.o simple_inheritance.c


Running simple_inheritance produces

$ ./simple_inheritance
t.o.foo: 1
o->foo: 1
o->foo: 1
$


Spiffy, no?

Ok, this is semi-cool, but that obnoxious cast on line 14 spoils some of the fun.

As it turns out, that cast is necessary to keep the compiler from squawking about potential type errors, but… and this is a very important but… our handy C Preprocessor (CPP) can make that sort of go away. That is, we can write a macro to hide that cast, then invoke the macro when we want to use foo.

Now if you’re like me, you’re gonna be thinking something like “Who gives a rat’s patootie about foo? Why is that useful to me?”

Remember those typedefs, how we hid the implementation of structures?

We can do the same thing with functions by typedef’ing callbacks, then using those callbacks as elements in our spiffy Object struct. Recall prissy languages like Java usually have some sort of toString() method sitting way back up the inheritance tree, usually at its root, which is, not surprisingly, called “Object.”

By now you should correctly surmise this little series of articles on Object Oriented C is going somewhere. I won’t claim to know where it’s going (and wouldn’t say if I knew), but it’s on its way somewhere.

In fact, these articles are writing themselves, so let’s just follow along and see where we end up.

Simple Object Oriented Unit Testing For C Programming

Recovery from giant blast of wind

Remember that giant blast of wind a few months ago? The wind that launched all my cool plants down the stairs? Well, here’s a picture of the re-potted collection. That’s about half of them, the other half are still languishing. Crassulae are tough as nails. I may re-pot the remaining… but these need to be split out now. They’re a little overcrowded. (Update: I did split that pot and repot the remaining. Many of them ended up at my friend Ben’s house in Berkeley.)

From left to right: kalanchoe longifolia, unknown crassula, kalanchoe tubiflora, unknown sedum, kalanchoe serrata, unknown kalanchoe, another k. longiflora, sedum burrito, kalanchoe glaucesens, kalanchoe gastonis-bonnerai (hardly visible, very sick little plant), kalanchoe tomentosa, kalanchoe pumila. There’s a kalanchoe marmorata buried in there as well, but that one volunteered. Whew!

The important stuff dealt with, let’s get on with some C programming…

Really simple unit testing for C

Several years ago I found myself needing a simple way to unit test some C code I was working on. I had previously used JUnit, which was relatively new at that time. cppunit either wasn’t released yet, or was very raw, and besides, my code was straight C.

So I wrote my own.

I knew I didn’t want the complexity of the full JUnit framework, but I did want the convenience of an object-oriented system.

The general design of this simple object-oriented C unit testing code is to keep the test harness very simple. Each class test can be run as its own program, allowing invocation from a shell script, or you can stack the tests like brickwork into increasingly more general C programs.

This is in contrast to the JUnit design where you have to set up more framework in advance, and you add tests within their framework.

The JUnit design is probably more powerful in the sense that it scales better.

On the other hand, mine is simple enough that you can design your own system using it as a building block. And small enough that you can very easily embed it into your code and ship it for internal testing if you like.

I’m sure there are other ways to do it.

Design of a simple unit testing class

The design is simple enough to implement as a class in either C or C++. I’ve done both, used both in production, and will compare the two approaches here. First, an overview on the general idea:

1. In the C code, the unit test struct definition is located in the header file, making it defined for any file including the unit testing header file. It’s very simple, just a table:

/** Each unit test builds a table that can be
 * traversed to exercise each function in the
 * component.
 */
typedef struct _testfunc {
    int (*test)(void);
    const char * testname;
} TestFunc;

2. Here’s an example of a table from the demo code, which tests the correctness of a line segment intersection algorithm:

TestFunc tf[] = {
    {test_vertical_cross,       "vertical_cross"       },
    {test_vertical_parallel,    "vertical_parallel"    },
    {test_endpoint_intersection,"endpoint_intersection"},
    {test_total_overlap,        "total_overlap"        },
    {test_partial_overlap,      "partial_overlap"      },
    {test_no_intersection1,     "no_intersection1"     },
    {NULL,                      ""                     }
};


These functions could be hashed into a red/black tree or something, but that’s too much work and too much code. It doesn’t really matter how the functions are ordered, and they’re all going to be called one after another, so a dispatch table is just fine.

3. As you will see from the source code, it’s very simple: a loop scans the table and invokes each callback in turn, until it gets to NULL. The callbacks are boolean, which you will see in the C++ code, returning pass or fail:

int
unittest(TestFunc * testfunc) {

    int i = 0;
    int passed = TRUE;

    while (testfunc[i].test != NULL) {

        if (testfunc[i].test()) {
            fprintf(stdout, "Passed test_%s.\n", testfunc[i].testname);
        } else {
            fprintf(stdout, "Failed test_%s.\n", testfunc[i].testname);
            passed = FALSE;
        }
        i++;
    }

    fprintf(stdout,"\n");

    return passed;
}


The default value in each of the test functions should be FALSE. Can you see why the default return value is TRUE in this function?

Testing for line intersection

For an example, we’ll be testing a function that determines if two line segments intersect. The code is part of a geometry component for the Discontinuous Deformation Analysis application. The idea is that we need to create geometrical blocks from collections of line segments. Line segments that cross each other need to be partitioned into pieces for use by different blocks, and segments that overlap each other need to be appropriately trimmed or deleted. There’s more to block construction than this, but these tasks are handled by the code we’re going to unit test here.

Here’s the overall plan for testing:

1. Test for intersection
2. Test for non-intersection
3. Test for partial overlapping
4. Test for full overlapping

We’ll construct a set of line segments for testing each case.

Remember: unit testing cannot guarantee correctness. It can only find situations which are specified as incorrect. The specification for correctness and incorrectness has to be determined out of band. In this case, that means drawing out a set of line segments on graph paper, picking off the endpoints, feeding them into the algorithm, and determining whether the algorithm is returning what we think it ought to be returning.

Download the Object Oriented C toolkit and take a look for yourself. The code is LGPL licensed, but truthfully, I’m not that worried about it. If you find it useful, and you can show me something cool you did with it or how you extended it, I’m happy. Alternatively, I’d be happy to add any minor tweaks you might like to use back into the main code base… as long as the extensions are simple. If you want a full-blown framework, use CGreen or Boost.

These are the best links I could find on why unit testing is important:

• JUnit.org: The original home of unit testing. You can learn a lot here.
• cgreen is an interesting tool, but much heavier than the tool outlined in this article.
• What’s with these plants anyway?

Plants are cool. I like them.

I have plenty of spares, mostly succulents. If you’re an East Bay or San Francisco local and would like a cool plant, stop by some time and get a couple. Or drop me a line and I’ll drop off a plant for you.

How To Hide A Struct Member in the C Programming Language

Update: A friend pointed out a typo, which is now fixed in both the article and the code.

Hiding a struct member in C is easy using incomplete types. There are two ways to do it:

1. Hide the whole struct definition
2. Hide a single member of a struct

Either way requires using a typedef to define an incomplete type, where the definition of the type is separate from the declaration of the type.

If you do any coding in the C programming language at all, learning this simple technique will provide you with many benefits:

1. Your programs will become cleaner as you learn the techniques of encapsulation and data hiding
2. You will find other C source code easier to read
3. Lastly, you will much better understand how object-oriented languages such as Java and Ruby work.

WARNING: Extensively employing incomplete types requires that you adopt bulletproof memory management strategies.

Method 1: Hide the whole struct

Hiding the whole struct is done by declaring a typedef in a header file and defining the struct in the source file. Here’s the source for foo.h:

#ifndef IS_FOO_H
#define IS_FOO_H

typedef struct _foo Foo;

#endif /* IS_FOO_H */


We define the struct in foo.c:

#include <stdio.h>
#include "foo.h"

struct _foo {
    int bar;
    int baz;
};

int
main(int argc, char ** argv) {

    Foo foo;
    foo.bar = 1;
    foo.baz = 2;
    fprintf(stdout,"Foo.bar: %d, Foo.baz: %d\n",foo.bar, foo.baz);

    return 0;
}


The key is using the typedef in a header file.

Method 2: Hide one member of a struct

The second method uses a technique called aggregation, where one type is defined as a collection of other types or data. A struct or a relational database table aggregates types or data. Continuing from above, let’s add a pointer to Foo type to a new struct called Snafu:

struct _snafu  {
    Foo * f;
    int fubar;
};


Now we have all the members of Foo hidden from all users of Snafu. We can’t really do anything with it, since there isn’t any memory allocation code written for Foo. If there were allocation code, then f could be set to a pointer value.

Simple inheritance

Just to be complete, Foo could be added like this, provided the definition of Foo was in scope (which it isn’t in this example):

struct _snafu {
    Foo f;
    int fubar;
};


Notice something else: because we declared Foo in the first position of Snafu, we could treat Snafu as a child class of Foo, such that Snafu inherits from Foo. But we’re not going to; we’ll deal with classes in a future article on object-oriented C programming. Choosing whether to declare the struct variable in whole or to reference it using a pointer depends on the purpose of the code, specifically, when, where and how the memory for Foo is going to be handled. Memory handling is definitely going off the deep end for this article.

Incomplete types are powerful tools

In C, incomplete types are the tool for keeping data that belongs to one subsystem from being referenced by another. It’s possible to use this technique as part of a method for constructing object-oriented systems in pure C, including systems using “protected” classes where the struct internals are defined in a private header file that’s used only within the object system.

>>>NOTE: Here’s a much longer article which goes into more detail on incomplete and derived types.

Data Hiding in C: Programming Using Incomplete and Derived Types

>>>NOTE: You may want to start with a much simpler article on hiding members in structs.

Data hiding and encapsulation in C are fairly easy using the notions of derived and incomplete types. A derived type is a type built from other types, such as a struct, an array or a pointer. An incomplete type is a type whose declaration is in scope but whose definition is not. Incomplete types are extremely useful for data hiding when the type declaration is in the header file, and the definition and implementation of the type are shrouded in the C source file. For example, consider defining a type for handling paintings in an art collection:

struct painting {
    uint32_t inventory_control;
    uint32_t purchase_price;
    char painting_name[256];
    char artist_name[256];
};


This struct definition could be placed in a header file, say, painting.h, for inclusion into application code, then the members of the struct can be accessed as needed. But what happens when you need to change the members of the struct? Perhaps you would prefer to allocate the char buffers instead of declaring their sizes statically, and need to add a member for current owner:

struct _painting {
    uint32_t inventory_control;
    uint32_t purchase_price;
    char * painting_name;
    char * artist_name;
    char * owner_name;
};


Now all the code depending on the first definition is broken. One way out of this bind is using incomplete types with accessor methods, just as you would use them in Java or C++.

In the header file painting.h, declare the painting type with a typedef:

typedef struct _painting Painting;


How you choose to capitalize, underscore or otherwise name “classes” is your business. Personally, I loathe the so-called “CamelCase” convention, but will use capital letters to denote a user-defined type.

Memory management

Managing memory is one of the most error-prone aspects of writing code in the C programming language. Using incomplete types, notice that you now have no way to directly allocate memory for your type in your application program, although you can deallocate using free() anywhere you have a Painting pointer declared. So allocation must be wrapped, and it makes really good sense to wrap the deallocation as well. For example, here is one way to do it:

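Painting *
painting_new(void) {

    Painting * p = (Painting *)malloc(sizeof(Painting));
    if (p == NULL) {
        return NULL;
    }
    /* shred: mark every byte as allocated but not yet initialized */
    memset(p,0xda,sizeof(Painting));
    return p;
}

void
painting_delete(Painting * p) {

    /* Every pointer member must have been set (to allocated memory
     * or to NULL) before this is called: free() on a shredded
     * pointer is undefined behavior. */
    free(p->artist_name);
    free(p->owner_name);
    free(p->painting_name);
    /* shred again: mark every byte as freed */
    memset(p,0xdd,sizeof(Painting));
    free(p);
}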


So what’s going on here? Why the call to memset? And why different values when allocating (0xda) versus freeing (0xdd)? This technique is called “shredding” and I was first exposed to it in an excellent book by Steve Maguire called “Writing Solid Code.”

The purpose of shredding is to set all the bytes in a struct (or other allocated set of bytes) to a value that has meaning to the programmer, but is otherwise nonexecutable nonsense. The value of this practice is suddenly realized when stepping through code in a debugger. Series of 0xdadadada indicate you are accessing an uninitialized field in the data structure. Similarly, series of 0xdddddddd indicate you are accessing memory that has been freed.

Defining accessors

Now, in the C file, define the _painting struct as above, and provide get/set methods for each member. For example, to get and set the values for artist_name, I write these methods as follows:

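char *
painting_get_artist_name(Painting * p) {
    return p->artist_name;
}

/* Assumes p->artist_name already points to a buffer of at least s
 * bytes. Note that strncpy leaves the buffer unterminated if
 * artist_name has s or more characters. */
void
painting_set_artist_name(Painting * p, char * artist_name, size_t s) {
    strncpy(p->artist_name,artist_name,s);
}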


The appropriate prototypes for these methods are declared in the painting.h header file.

Again, how you choose to capitalize, underscore names and otherwise format your code is your business, but let’s take a closer look at my convention. First, I use the name of the struct (painting) to prefix all method calls that are publicly declared in the header file. I follow this with a verb (set or get) to indicate the action I want to perform, then the name of the member of the struct as object to the action, in this case “artist_name.” Note that I pass the length of the name in as a parameter for use in the strncpy function to guard against buffer overflow problems. (Where you get this length is your business as well).

Now, you have a header file that functions as an interface to your source code. You can add or delete members of any struct at any time without breaking your application code. To handle members that have been removed, you can signal error conditions in the appropriate get/set methods. Handling error conditions can be done in several ways, but that topic is outside the scope of this document at the moment.

Generate code automatically

Lastly, while all this code appears wordy, note that it’s pretty easy to write a code generator in any language that can handle regular expressions. I have written code generators in sh, perl and lua, which have variously taken key, value pairs or struct definitions as inputs. Developing an API with more than a dozen types of structs, each of which has 4 to 40 members, makes automatic code generation time-effective.

Pitfalls

The approach above has some traps for the unwary, the most important of which is where and how to allocate memory for pointer fields in the struct. Several different approaches can be taken; I will investigate a couple in a future update to this post.