[Editor's Note: First posted on THE SHORTED TURN's WordPress.com site on 2010/02/21.]
Hello again, Gentle Reader,
Have you ever run across code littered with warnings to “keep this array synced with that enum!”? Have you ever needed not just the numeric value of an enumerator, but its actual name, perhaps for debugging via printf()s? Have you ever needed to serialize and deserialize enums in a portable way?
If you have come up against any of these problems, you’ve also probably exclaimed, “I’m SICK of Fighting The Tool! There MUST be a better way! Help me Shorted Turn, you’re my only hope!” Good news, Gentle Reader, there is indeed a better way:
X-Macros.
The Problem
Look at the code from any C or C++ project. Five’ll get you ten that you’ll find more than one instance of the following anti-pattern tragedy:
thing.h:
// WARNING! DON'T CHANGE THIS ENUM WITHOUT ALSO CHANGING THE ARRAYS!!!!
typedef enum
{
E_ITEM_ONE,
E_ITEM_TWO,
E_ITEM_THREE
} TE_ITEM;
thing.c:
#include <stdio.h>
#include "thing.h"
// WARNING! DON'T CHANGE THIS ARRAY WITHOUT ALSO CHANGING THE ENUM!!!!
/// Text corresponding to the various E_ITEMs.
const char *f_thing_text[] =
{
"Item 1",
"Item 2",
"Item 3"
};
// Yeah this one too!
/// Text of the enum, in case we want to display the actual enum.
const char *f_thing_enum_as_strings_for_debug_use[] =
{
"E_ITEM_ONE",
"E_ITEM_TWO",
"E_ITEM_THREE"
};
// Don't forget this one!
/// Values to use for serializing the enum to the external world.
unsigned char f_external_representation[] =
{
0x42, // E_ITEM_ONE
0x05, // E_ITEM_TWO
0x98 // E_ITEM_THREE
};
////// Ok, now the functions that work with the enum and arrays.
/// Function for serializing the enum out of the system.
unsigned char SerializeEnum(TE_ITEM item)
{
return f_external_representation[item];
}
/// Function for deserializing an enum into the system.
TE_ITEM DeserializeEnum(unsigned char external_rep)
{
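size_t i;
// Divide by the element size so the bound is the number of entries, not the number of bytes.
for(i=0; i<sizeof(f_external_representation)/sizeof(f_external_representation[0]); i++)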
{
if(f_external_representation[i] == external_rep)
{
// We found it.
return (TE_ITEM)i;
}
}
// We didn't find it, return some sort of error.
return (TE_ITEM)0;
}
int main()
{
TE_ITEM e;
// Let's demonstrate with E_ITEM_TWO.
e = E_ITEM_TWO;
printf("X-Macro demonstration 1\n");
printf("=======================\n");
printf("Enum in use is: %s\n", f_thing_enum_as_strings_for_debug_use[e]);
printf("%s has the external representation %d\n", f_thing_enum_as_strings_for_debug_use[e],
(int)SerializeEnum(e));
printf("If we deserialized the number 0x42, we would get the enum %s\n",
f_thing_enum_as_strings_for_debug_use[DeserializeEnum(0x42)]);
return 0;
}
There are two main problems here:
- No amount of “// WARNING!! Keep this synced with …” comments will prevent somebody, sometime, from getting the various arrays and enums out of sync. That’s a Shorted Turn Guarantee.
- Multiple constructs which are all inextricably related are strewn about the codebase for no good reason. In this example, it’s four constructs spread across two files, but I have personally seen much worse. While inelegant to be sure, the real problem with this is that it only exacerbates problem #1. “Out of sight, out of mind,” so the old saying goes.
The Solution
X-Macros address both issues by putting all the related information into a single table in a single file, which is then cherrypicked to populate the various enums and arrays. This makes it impossible for the constructs to get out of sync. They do all their work at preprocessing-time, and hence incur no run-time overhead. Everybody wins!
There are two basic X-Macro patterns: one which uses a separate file to hold the table and one which uses a #define to do so. There is little difference between the two, and they both beat the code above hands-down. I shall discuss the #define type first.
#define-based X-Macros
#define-based X-Macros store the table in a single function-style macro. This macro takes a parameter which is itself a macro, which I call a “picker macro”. The various fields of the table macro are extracted with this “picker macro” at the point they’re needed.
Refactoring the example with a #define-based X-Macro, we get the following:
thing-define.h:
/// ITEM X-Macro table.
/// Fields are:
/// Enumerator, display text, external representation
/// Note that we don't have or need a column for printable enumerator names, i.e. "E_ITEM_ONE".
#define M_ITEM_TABLE(picker) \
picker(E_ITEM_ONE, "Item 1", 0x42) \
picker(E_ITEM_TWO, "Item 2", 0x05) \
picker(E_ITEM_THREE, "Item 3", 0x98)
typedef enum
{
// Define a "picker" macro to only pick out the enum name.
#define M_PICKER(enum_name, str, ext_rep) enum_name,
// The following will now expand to the same TE_ITEM definition we had in the original example code.
M_ITEM_TABLE(M_PICKER)
#undef M_PICKER
} TE_ITEM;
thing-define.c:
#include <stdio.h>
#include "thing-define.h"
/// Text corresponding to the various E_ITEMs.
const char *f_thing_text[] =
{
// Define a "picker" macro to only pick out the strings.
#define M_PICKER(enum_name, str, ext_rep) str,
M_ITEM_TABLE(M_PICKER)
#undef M_PICKER
};
/// Text of the enum, in case we want to display the actual enum.
const char *f_thing_enum_as_strings_for_debug_use[] =
{
// A little fancier here: Define a "picker" macro to only pick out the enums, and stringize them so
// that we have an array of the string representation of the enumerators.
#define M_PICKER(enum_name, str, ext_rep) #enum_name,
M_ITEM_TABLE(M_PICKER)
#undef M_PICKER
};
/// Values to use for serializing the enum to the external world.
unsigned char f_external_representation[] =
{
#define M_PICKER(enum_name, str, ext_rep) ext_rep,
M_ITEM_TABLE(M_PICKER)
#undef M_PICKER
};
////// Ok, now the functions that work with the enum and arrays.
/// Function for serializing the enum out of the system.
unsigned char SerializeEnum(TE_ITEM item)
{
return f_external_representation[item];
}
/// Function for deserializing an enum into the system.
TE_ITEM DeserializeEnum(unsigned char external_rep)
{
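size_t i;
// Divide by the element size so the bound is the number of entries, not the number of bytes.
for(i=0; i<sizeof(f_external_representation)/sizeof(f_external_representation[0]); i++)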
{
if(f_external_representation[i] == external_rep)
{
// We found it.
return (TE_ITEM)i;
}
}
// We didn't find it, return some sort of error.
return (TE_ITEM)0;
}
int main()
{
TE_ITEM e;
// Let's demonstrate with E_ITEM_TWO.
e = E_ITEM_TWO;
printf("X-Macro demonstration 2\n");
printf("=======================\n");
printf("Enum in use is: %s\n", f_thing_enum_as_strings_for_debug_use[e]);
printf("%s has the external representation %d\n", f_thing_enum_as_strings_for_debug_use[e],
(int)SerializeEnum(e));
printf("If we deserialized the number 0x42, we would get the enum %s\n",
f_thing_enum_as_strings_for_debug_use[DeserializeEnum(0x42)]);
return 0;
}
Notice what we’ve gained:
- All the data is in a single table in a single location, namely the M_ITEM_TABLE #define. That alone will pay for itself the first time you’re asked “Hey Gentle Reader, what’s the external representation of E_ITEM_THREE again?”
- The enums and arrays are all in exactly the same places they were before, but since we’re extracting their contents from the M_ITEM_TABLE, it is now impossible for them to get out of sync.
- We’ve not materially increased the number of lines of code. In fact, given a data table larger than the three rows in our example, we would have considerably decreased the total number of lines of code.
“There is nothing so useless as doing efficiently that which should not be done at all.” – Peter Drucker
“// WARNING! Keep this synced with…” should not be done at all.
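One way to lean on the single table a little harder is to let it size and check things for you. The sketch below is an extension under stated assumptions, not part of the example above: E_ITEM_COUNT and ItemText() are hypothetical names introduced here for illustration, and static_assert requires a C11 (or later) compiler.

```c
#include <assert.h> /* static_assert (C11) */

/* Same table as in thing-define.h. */
#define M_ITEM_TABLE(picker) \
    picker(E_ITEM_ONE,   "Item 1", 0x42) \
    picker(E_ITEM_TWO,   "Item 2", 0x05) \
    picker(E_ITEM_THREE, "Item 3", 0x98)

typedef enum
{
#define M_PICKER(enum_name, str, ext_rep) enum_name,
    M_ITEM_TABLE(M_PICKER)
#undef M_PICKER
    E_ITEM_COUNT /* Hypothetical sentinel: ends up equal to the number of table rows. */
} TE_ITEM;

/* Text corresponding to the various E_ITEMs, generated from the table. */
static const char *const f_thing_text[] =
{
#define M_PICKER(enum_name, str, ext_rep) str,
    M_ITEM_TABLE(M_PICKER)
#undef M_PICKER
};

/* Compile-time check that the generated array and the generated enum agree in size. */
static_assert(sizeof(f_thing_text) / sizeof(f_thing_text[0]) == E_ITEM_COUNT,
              "f_thing_text must have exactly one entry per TE_ITEM");

/* Hypothetical helper: bounds-checked lookup of the display text. */
const char *ItemText(TE_ITEM item)
{
    return ((unsigned)item < (unsigned)E_ITEM_COUNT) ? f_thing_text[item]
                                                     : "(unknown item)";
}
```

With both the enum and the array generated from M_ITEM_TABLE, the assertion should never fire; it mostly earns its keep if somebody later hand-edits one of the two and bypasses the table.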
Separate-File-Based X-Macros
The X-Macro technique as shown above is perfectly adequate for tables with relatively few entries. However, when your tables start getting large, you run into two issues:
1. The line-continuations (“\”) become a pain. They prevent the use of “//”-style comments within your X-Macro table, and they’re easy to forget, which will break the build. Still, these are minor annoyances.
2. Since your table is one logical line, you may run into macro-length limits in your preprocessor and/or line-length limits in your compiler. This could be a deal-breaker.
Fortunately, there’s a variation on the above technique which neatly solves the above problems: the Separate-File-Based X-Macro. In this pattern, the table is kept in its own file, and instead of invoking the macro directly between the picker definitions, the X-Macro table file is #included wherever it’s needed. To wit:
thing-file.def:
/// ITEM X-Macro table, kept in its own separate file.
/// Fields are:
/// Enumerator, display text, external representation
/// Note that we don't have or need a column for printable enumerator names, i.e. "E_ITEM_ONE".
M_PICKER(E_ITEM_ONE, "Item 1", 0x42)
M_PICKER(E_ITEM_TWO, "Item 2", 0x05)
M_PICKER(E_ITEM_THREE, "Item 3", 0x98)
thing-file.h:
/// Our Enum
typedef enum
{
// Define a "picker" macro to only pick out the enum name.
#define M_PICKER(enum_name, str, ext_rep) enum_name,
// Now we include the .def file, where before we invoked the table's macro.
// The following will now expand to the same TE_ITEM definition we had in the original example code.
#include "thing-file.def"
#undef M_PICKER
} TE_ITEM;
thing-file.c:
#include <stdio.h>
#include "thing-file.h"
/// Text corresponding to the various E_ITEMs.
const char *f_thing_text[] =
{
// Define a "picker" macro to only pick out the strings.
#define M_PICKER(enum_name, str, ext_rep) str,
#include "thing-file.def"
#undef M_PICKER
};
/// Text of the enum, in case we want to display the actual enum.
const char *f_thing_enum_as_strings_for_debug_use[] =
{
// A little fancier here: Define a "picker" macro to only pick out the enums, and stringize them so
// that we have an array of the string representation of the enumerators.
#define M_PICKER(enum_name, str, ext_rep) #enum_name,
#include "thing-file.def"
#undef M_PICKER
};
/// Values to use for serializing the enum to the external world.
unsigned char f_external_representation[] =
{
#define M_PICKER(enum_name, str, ext_rep) ext_rep,
#include "thing-file.def"
#undef M_PICKER
};
////// Ok, now the functions that work with the enum and arrays.
/// Function for serializing the enum out of the system.
unsigned char SerializeEnum(TE_ITEM item)
{
return f_external_representation[item];
}
/// Function for deserializing an enum into the system.
TE_ITEM DeserializeEnum(unsigned char external_rep)
{
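size_t i;
// Divide by the element size so the bound is the number of entries, not the number of bytes.
for(i=0; i<sizeof(f_external_representation)/sizeof(f_external_representation[0]); i++)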
{
if(f_external_representation[i] == external_rep)
{
// We found it.
return (TE_ITEM)i;
}
}
// We didn't find it, return some sort of error.
return (TE_ITEM)0;
}
int main()
{
TE_ITEM e;
// Let's demonstrate with E_ITEM_TWO.
e = E_ITEM_TWO;
printf("X-Macro demonstration 3\n");
printf("=======================\n");
printf("Enum in use is: %s\n", f_thing_enum_as_strings_for_debug_use[e]);
printf("%s has the external representation %d\n",
f_thing_enum_as_strings_for_debug_use[e], (int)SerializeEnum(e));
printf("If we deserialized the number 0x42, we would get the enum %s\n",
f_thing_enum_as_strings_for_debug_use[DeserializeEnum(0x42)]);
return 0;
}
Now the table can have an arbitrary number of rows, regardless of any line-length limitations of your compiler, and you can comment each row with C++-style comments to your heart’s content. I personally prefer this style over the #define-based style for these very reasons.
One important safety tip: don’t be tempted to add the standard header include guard to the .def file. This file is explicitly intended to be included as many times as necessary in the same file, and be re-preprocessed each time.
Bonus: X-Macros increase runtime efficiency
While it’s all well and good that X-Macros bulletproof our constant data tables without incurring any runtime cost, it’s too bad that we can’t actually gain performance by using them. OH WAIT, WE CAN! Yes Gentle Reader, like you, I don’t particularly like that for() loop in all the versions of DeserializeEnum() above either. It turns out that with this X-Macro technique, we can eliminate the loop entirely, and turn it into a switch statement:
TE_ITEM DeserializeEnum(unsigned char external_rep)
{
switch(external_rep)
{
#define M_PICKER(enum_name, str, ext_rep) case ext_rep: return enum_name;
#include "thing-file.def"
#undef M_PICKER
// We didn't find it, return some sort of error.
default: return (TE_ITEM)0;
}
}
Now, we have to make a small leap of faith that the compiler will implement a switch/case in a more efficient way than a loop over the elements of an array, but even with a totally naive compiler, it shouldn’t be any worse.
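For anyone who wants to try the switch-based idea without setting up a separate .def file, here is a minimal single-file sketch. It swaps in the #define-based table from earlier, and it assumes a hypothetical TryDeserializeEnum() that reports failure through a bool instead of returning (TE_ITEM)0, which a caller cannot tell apart from E_ITEM_ONE. That name and signature are illustrative assumptions, not part of the code above.

```c
#include <stdbool.h>

/* Same table as in thing-define.h. */
#define M_ITEM_TABLE(picker) \
    picker(E_ITEM_ONE,   "Item 1", 0x42) \
    picker(E_ITEM_TWO,   "Item 2", 0x05) \
    picker(E_ITEM_THREE, "Item 3", 0x98)

typedef enum
{
#define M_PICKER(enum_name, str, ext_rep) enum_name,
    M_ITEM_TABLE(M_PICKER)
#undef M_PICKER
} TE_ITEM;

/* Hypothetical variant of DeserializeEnum(): returns true and stores the
 * matching enumerator in *out, or returns false (leaving *out untouched)
 * when external_rep is not in the table. */
bool TryDeserializeEnum(unsigned char external_rep, TE_ITEM *out)
{
    switch (external_rep)
    {
        /* Each table row expands to one case label. */
#define M_PICKER(enum_name, str, ext_rep) case ext_rep: *out = enum_name; return true;
        M_ITEM_TABLE(M_PICKER)
#undef M_PICKER
        default:
            return false;
    }
}
```

A side effect worth noting: C does not allow two case labels with the same value in one switch, so two rows that accidentally share an external representation should fail to compile rather than silently shadow each other, which the array-and-loop version would not catch.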
Conclusion
X-Macros are a truly powerful solution to some otherwise-intractable problems. They have the additional benefit of forcing the Tool to Fight itself for once, which brings its own sweet, sweet satisfaction.
Add X-Macros to your ever-growing arsenal of coding methods, don’t look back, and Increase Your Code Power!