Why would you recursively define macros in terms of other macros in C?


I wanted to see how the Arduino function digitalWrite actually works. But when I looked at the source for the function, it was full of macros that were themselves defined in terms of other macros. Why would it be built this way instead of just using a function? Is this just poor coding style, or the proper way of doing things in C?

For example, digitalWrite contains the macro digitalPinToPort.

#define digitalPinToPort(P) ( pgm_read_byte( digital_pin_to_port_PGM + (P) ) )

And pgm_read_byte is a macro:

#define pgm_read_byte(address_short)    pgm_read_byte_near(address_short)

And pgm_read_byte_near is a macro:

#define pgm_read_byte_near(address_short) __LPM((uint16_t)(address_short))

And __LPM is a macro:

#define __LPM(addr)         __LPM_classic__(addr)

And __LPM_classic__ is a macro:

#define __LPM_classic__(addr)   \
(__extension__({                \
    uint16_t __addr16 = (uint16_t)(addr); \
    uint8_t __result;           \
    __asm__ __volatile__        \
    (                           \
        "lpm" "\n\t"            \
        "mov %0, r0" "\n\t"     \
        : "=r" (__result)       \
        : "z" (__addr16)        \
        : "r0"                  \
    );                          \
    __result;                   \
}))

Not directly related to this, but I was also under the impression that double underscore should only be used by the compiler. Is having LPM prefixed with __ proper?

| C   | arduino   2017-01-07 19:01 2 Answers

Answers ( 2 )

  1. 2017-01-07 19:01

    Why would it be built this way instead of just using a function?

    The purpose is likely to avoid function-call overhead on pre-C99 compilers, which don't support inline functions (via the inline keyword). This way, the preprocessor essentially merges the whole stack of function-like macros into a single block of code.

    Every time you call a function in C, there is a tiny overhead from jumping to another part of the program's code, managing the stack frame and passing arguments. This cost is negligible in most applications, but if the function is called very often, it may become a performance bottleneck.

    Is this just poor coding style or the proper way of doing things in C?

    It's hard to give a definitive answer, because coding style is a subjective topic. I would say, consider using inline functions or (even better) let the compiler inline them by itself. They are type-safe, more readable and more predictable, and with proper assistance from the compiler, the net result is essentially the same.
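
    To make the comparison concrete, here is a minimal sketch that puts a stack of function-like macros next to an equivalent pair of inline functions. All names (pin_to_port_table, READ_BYTE, PIN_TO_PORT, pin_to_port, lookups_agree) are made up for illustration, and the table sits in ordinary memory rather than AVR flash, so this is not the Arduino implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup table; on an AVR the real one would live in flash. */
static const uint8_t pin_to_port_table[] = { 4, 4, 4, 4, 2, 2, 3, 3 };

/* Macro version: three layers that the preprocessor collapses
   into a single expression at every use site. */
#define READ_BYTE_RAW(addr) (*(const uint8_t *)(addr))
#define READ_BYTE(addr)     READ_BYTE_RAW(addr)
#define PIN_TO_PORT(p)      (READ_BYTE(pin_to_port_table + (p)))

/* Inline-function version: same lookup, but parameters and result are typed. */
static inline uint8_t read_byte(const uint8_t *addr) { return *addr; }
static inline uint8_t pin_to_port(uint8_t pin)
{
    return read_byte(pin_to_port_table + pin);
}

/* Returns 1 if both forms give the same port for every pin in the table. */
int lookups_agree(void)
{
    for (size_t pin = 0; pin < sizeof pin_to_port_table; pin++) {
        if (PIN_TO_PORT(pin) != pin_to_port((uint8_t)pin))
            return 0;
    }
    return 1;
}
```

    With optimization enabled, a compiler will typically generate the same code for both forms; the difference is that the inline version checks argument types and shows up by name in a debugger.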

    Related reference (it's for C++, but the idea is generally the same for C):

  2. 2017-01-07 20:01

    If your question is "why one would use multiple layers of macros?", then:

    • why not? Notably in the pre-C99 era, without inline. A canonical example is the getc of the 1980s, which (IIRC) was documented at that time (SunOS 3.2, 1987) as being a macro, with a NOTE in the man page saying (I forgot the details) that, given some FILE* filearr[];, a getc(filearr[i++]) was wrong, since the macro may evaluate its argument more than once (IIRC, the undefined behavior terminology did not exist at that time). When you looked into some system headers (e.g. <stdio.h> or some header included by it) you could find the definition of such a macro. And at that time (with computers running at only a few MHz, so about a thousand times slower than today) getc had to be a macro for efficiency reasons (since inline did not exist, and compilers did not do interprocedural optimizations like they are able to do now). Of course, you could use getc in your own macros.
    • even today, some standards define macros. In particular, today's waitpid(2) documentation describes WIFEXITED & WEXITSTATUS as macros, and it is sensible to #define some of your own macros combining both of them.
    • the main point is to understand how the C preprocessor works, and its profoundly textual (so very brittle) nature. This is explained in all textbooks about C. Hence you need to understand what is happening under the hood.
    • the rule of thumb with modern era C (that is C99 & C11) is to systematically prefer having some static inline function (defined in some header file) to an equivalent macro. In other words, #define some macro only when you cannot avoid it. And explicitly document that fact.
    • several layers of macro might (sometimes) increase code readability.
    • macros can be tested with #ifdef macroname which is sometimes useful.
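
    Two of the points above (the getc-style double evaluation, and testing a macro with #ifdef) can be sketched as follows. This is only an illustration: MAX_MACRO, max_inline, next_value and the other names are hypothetical, and the side effect is a plain call counter rather than a FILE* array.

```c
#include <assert.h>

/* Macro: the larger argument is evaluated a second time. */
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

/* Inline function: each argument is evaluated exactly once. */
static inline int max_inline(int a, int b) { return a > b ? a : b; }

/* Hypothetical helper with a visible side effect: it counts its own calls. */
static int next_value(int *calls) { *calls += 1; return *calls; }

/* Number of times next_value runs when passed through the macro. */
int calls_made_by_macro(void)
{
    int calls = 0;
    (void)MAX_MACRO(next_value(&calls), 0);
    return calls;
}

/* Number of times next_value runs when passed through the inline function. */
int calls_made_by_inline(void)
{
    int calls = 0;
    (void)max_inline(next_value(&calls), 0);
    return calls;
}

/* A macro, unlike a function, can be detected at preprocessing time. */
#ifdef MAX_MACRO
#define HAVE_MAX_MACRO 1
#else
#define HAVE_MAX_MACRO 0
#endif

int have_max_macro(void) { return HAVE_MAX_MACRO; }
```

    Here the macro calls next_value twice and the inline function calls it once, which is the same trap as getc(filearr[i++]).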

    Of course, when you dare to define several layers of macros (I won't call them "recursive"; read about self-referential macros) you need to be very careful and understand what is happening with all of them (combined & separately). Looking at the preprocessed form is helpful.
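
    A small sketch of the difference between layered and self-referential macros, with made-up names (LAYER1 to LAYER3, counter). Layered macros are rescanned until no macro names remain, whereas a macro's own name is not expanded again inside its expansion, so nothing here recurses.

```c
#include <assert.h>

/* Layered macros: each expansion is rescanned for further macro names. */
#define LAYER1(x) ((x) + 1)
#define LAYER2(x) LAYER1((x) * 2)
#define LAYER3(x) LAYER2((x) + 3)

/* LAYER3(x) ends up as (((((x) + 3)) * 2) + 1). */
int through_three_layers(int x) { return LAYER3(x); }

static int counter = 10;

/* Self-referential macro (deliberately odd, shown only to illustrate the
   rule): the inner `counter` is left alone, so the expansion stops at
   (counter + 1) instead of looping forever. */
#define counter (counter + 1)

int read_counter(void) { return counter; }
```

    For example, through_three_layers(4) computes ((4 + 3) * 2) + 1, and read_counter() returns the variable's value plus one.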

    BTW, to debug complex macros, I sometimes do

    gcc -C -E -I/some/directory mysource.c | sed 's:^#://#:' > mysource.i

    then I look into mysource.i, and sometimes I even have to compile it, perhaps as gcc -c -Wall mysource.i, to get warnings located in the preprocessed form mysource.i (which I can examine in my emacs editor). The sed command "comments out" the lines starting with #, which set the source location (à la #line). Sometimes I even run indent mysource.i

    (actually, I have a special rule for that in my Makefile)

    Is having LPM prefixed with __ proper?

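    BTW, names starting with __ (or with _ followed by an uppercase letter) are reserved by the standard, and conventionally, for the implementation, and every name starting with _ is reserved at file scope. In principle you are not allowed to define such names in your own code, so there cannot be any collision; a header that is part of the implementation, like the one defining __LPM, is exactly where they belong.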

    Notice that the __LPM_classic__ macro uses the statement expression extension of GCC & Clang, written ({ ... }), which is not standard C.
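
    For readers who have not met that extension, here is a minimal sketch in the same shape as __LPM_classic__, but without inline assembly. CLAMP_TO_BYTE and clamp_to_byte are hypothetical names, and the code assumes GCC or Clang; other compilers may reject it.

```c
#include <assert.h>

/* GNU statement expression: the value of the ({ ... }) block is the value
   of its last expression. The temporary means the argument is evaluated
   only once, which a plain expression macro cannot guarantee. */
#define CLAMP_TO_BYTE(x)                                                \
    (__extension__({                                                    \
        int clamp_tmp_ = (x);                                           \
        clamp_tmp_ < 0 ? 0 : (clamp_tmp_ > 255 ? 255 : clamp_tmp_);     \
    }))

/* Thin wrapper so the macro can be exercised like a function. */
int clamp_to_byte(int value) { return CLAMP_TO_BYTE(value); }
```

    The __extension__ keyword only silences pedantic warnings about the non-standard construct; it does not make the code portable.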

    Look also at other programming languages. Common Lisp has a very different model of macros (and a much more interesting one). Read about hygienic macros. My personal regret is that the C preprocessor never evolved to be much more powerful (and Scheme-like). Why that did not happen (imagine a C preprocessor able to invoke Guile code to do the macro expansion!) is probably down to sociological & economic reasons.

    You still should sometimes consider using some other preprocessor (like m4 or GPP) to generate C code. See autoconf.
