2 November 2010

TL;DR: want to auto-generate .h files from .c files? Two vim commands get you most of the way there.

C sucks. [Disclaimer: OK, I don't like C; I'm a python/functional languages guy. You may love it. Whatever.]

There are lots of little annoying things about C: memory management, #ifdef include guards in header files, manually written header files, etc. The good news is, I've half-solved the last issue.

Say you have a simple main file, as follows:

int main(int argc, char ** argv) {
  char* str = malloc(sizeof(char) * 3);
  str[0] = getFirstChar();
  str[1] = 'F';
  str[2] = '\0';
  printf("%s", str);
  free(str);
  return 0;
}

char getFirstChar() {
  return 'A';
}

Dead-simple. Unfortunately, you get the following issue when trying to compile:

main.c:11: error: conflicting types for getFirstChar
main.c:3: note: previous implicit declaration of getFirstChar was here

Here's why this happens: the C compiler reads your .c file top to bottom in a single pass. The first time it sees getFirstChar, at the call inside main, it hasn't seen a declaration for it yet, so it creates an implicit one. Unfortunately, an implicit declaration always assumes a return type of int, while the actual return type is char. So when the real definition shows up further down, you get a type mismatch.

There are a couple of ways to solve this problem:

  1. Put getFirstChar above main. This kind of works, but now you've got to order your functions by who calls whom, which is a partial order you don't get to choose. If you've got two functions that call one another (mutual recursion), this won't work at all. In general, less flexibility than you'd like.
  2. Declare the function prototypes at the top of the file, or put them in a header file and #include it:
    int main(int argc, char** argv);
    char getFirstChar();
    ...
    #include "main.h"
    This is the 'proper' way to solve the problem, but now you've got a header file you've got to keep current. If you're the kind of person who plans all of your code out before you write it, you're paying a one-time cost to create a header file. If you're making updates, however, or if you're like me and you figure out what functions you should have as you go along, this means you're updating the header file every time you edit a function signature, just to freaking compile, all to solve a problem other languages don't even have.
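
For what it's worth, here is roughly what option 2 looks like when the prototypes sit at the top of the .c file. Treat it as a minimal sketch: buildString is a made-up helper (it is not in the original program) that stands in for the body of main, and main itself is left out so you can drop the snippet next to your own. I've also written the prototypes with (void), which is the strict way to say "takes no arguments" in C.

```c
#include <stdlib.h>

/* Prototypes: give the compiler the real signatures before first use. */
char getFirstChar(void);
char *buildString(void); /* hypothetical helper, not in the original */

/* Builds the two-character string from the example; caller must free it. */
char *buildString(void) {
    char *str = malloc(sizeof(char) * 3);
    if (str == NULL) {
        return NULL; /* allocation failed */
    }
    str[0] = getFirstChar(); /* fine now: declared above, defined below */
    str[1] = 'F';
    str[2] = '\0';
    return str;
}

/* The definition can live anywhere after the prototype. */
char getFirstChar(void) {
    return 'A';
}
```

Because the compiler has already seen that getFirstChar returns char, it never has to guess an int return type, and the order of the definitions stops mattering.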

The solution

In its most basic form, a header file is just your function signatures with semicolons instead of opening braces. It shouldn't be that hard to auto-generate the header file whenever you make a change, or whenever you compile. So that's what I started doing. I'm a vim guy. With a bit of research and poking around, here's a quick pair of commands you can run in vim to generate a simple .h file from a given .c file:

:%s/^\(\w.*{\t*$\)\@!.*\n//
then:
:%s/ {/;

[Disclaimer: Do not use this in its exact form for a serious project. Header files are a bit more complicated than the version we're dealing with. The current proposal is best suited for homework assignments or quick one-off projects.]

What this does: the first command removes every line that doesn't look like the start of a function definition (a line that begins with a word character and ends with an opening brace), and the second swaps the opening brace for a semicolon. Note that it assumes that you don't indent your function definitions and that you put the opening brace on the same line as the function signature. To use the commands, just copy the .c file to a .h, open the .h and run those commands.

Next Steps

I don't have to write any more C for another week or so, but I think the following steps make sense for taking .h file generation to the next level:

  1. Rewrite as a shell or Python script. Switching from vim commands to Python is probably going to add some flexibility.
  2. Incorporate the script directly into a Makefile. In this scenario, make regenerates the header files first, and only then calls GCC.
  3. Include some of the other stuff .h files typically include. The current approach assumes that all relevant comments, #includes, structs and typedefs in your program belong in the .c file exclusively. Typically, at least some of those belong in the .h instead. A future version would ideally amend an existing .h file rather than overwrite it.

Hopefully the commands will come in handy to others as spoiled by high-level languages as I am, who can't believe that C doesn't already do these things for you.

Thoughts/feedback?

Tags: #technical #code