What does this code do? Jim Hague's code: preprocessing in C
If you are learning the C programming language, one of the topics you should learn is “preprocessing”. Preprocessing is the first step in the compilation process, where the preprocessor removes comments and expands macros.
A great piece of code written in 1986 by Jim Hague is perfect for understanding this concept. Hague wrote the following code:
At first sight, it is practically impossible to understand, but if you know how the #define directive works, you can “de-obfuscate” the code.
First, we should compile the code and run the executable file. Our file is called hague.c and we will use gcc to compile it.
gcc hague.c -o h
gcc prints some warnings, but the executable file is created. When we run it, we see something like this:
If I type something like “what is this”, the program prints a series of dots and dashes. What could this mean? Well, if we compare the output with Morse code, we can see that it matches perfectly.
The tree above is a great tool for translating any message into Morse code. You just have to move left (DAH, or dash) or right (DIT, or dot) to work out each letter. So, the purpose of this code is to translate any input into Morse code.
Now it is time to transform the code and understand it.
Since the preprocessor removes comments and expands macros, we can use it to expand the #define directives and thus make the code more readable.
We use the gcc command with the -E flag.
gcc -E hague.c -o hague.pr
The new file hague.pr looks like this:
It looks better, doesn’t it?
We can see that the code uses a string full of letters and symbols. This string is the basis for mapping each character to its Morse symbols. Does it look familiar?
Yes, it represents the tree that helps us translate into Morse. (For instance, the letters E and T are the first two nodes below the root.)
Our next step is to tidy up the code (indent it). That way we can improve its readability.
Much better. Now it is easier to recognize some structures, but to continue we need to rename some identifiers: _DIT becomes DOT, DAH_ becomes DASH, DIT_ becomes DOTS and _DIT_ becomes DOTSDOTS. We also need to rename the functions _DAH and __DIT to TRANSFORM_TO_MORSE and PRINT_MORSE.
Now our file has the shape and structure of a readable C file.
We have two extra functions: PRINT_MORSE and TRANSFORM_TO_MORSE.
PRINT_MORSE uses the write function to print one character (one dot or one dash).
TRANSFORM_TO_MORSE uses PRINT_MORSE, but its main purpose is key: it transforms every input character into Morse code. It is also a recursive function: when DOTS_ > 3 it calls itself with DOTS_ shifted one bit to the right; otherwise it prints a zero byte.
The main function uses three nested loops. The inner loop checks whether the input contains characters and converts lowercase characters to uppercase. The middle loop calls TRANSFORM_TO_MORSE when the input has characters that are part of the _DAH_[] array. The outer loop prints a new line when the input finishes.