What is this code doing? Jim Hague's code and preprocessing in C

If you are learning the C programming language, one of the topics you should study is “preprocessing”. Preprocessing is the first step in the compilation process, in which the preprocessor removes comments and expands macros.

A great program written in 1986 by Jim Hague, a winning entry in the International Obfuscated C Code Contest (IOCCC), is perfect for understanding this concept. He wrote the following code:

[Image: the obfuscated source code of hague.c]

At first sight, it is practically impossible to understand, but if you know how the #define directive works, you can “de-obfuscate” the code.

First, we should compile the code and run the executable file. Our file is called hague.c and we will use gcc to compile it.

gcc hague.c -o h

gcc prints some warnings, but the executable file is still created. When we run it, we see something like this:

[Image: terminal session running the program]

If I type something like “what is this”, the program prints a series of dots and dashes. What could this mean? If we compare the output with Morse code, we can see that it matches perfectly.

[Image: Morse code tree]

The tree above is a great tool for translating any message into Morse code. You just move left (DAH, or dash) or right (DIT, or dot) to work out each letter. So the purpose of this code is to translate its input into Morse code.

Now it is time to transform the code and understand it.

Since the preprocessor removes comments and expands macros, we can run it on its own to expand the #define directives and make the code more readable.

We use the gcc command with the -E flag, which stops after the preprocessing stage.

gcc -E hague.c -o hague.pr

The new file hague.pr looks like this:

[Image: contents of hague.pr after preprocessing]

It looks better, doesn’t it?

We can see that the code uses a string full of letters and symbols. This string is the basis for mapping each character to its Morse symbols. Does it look familiar?

Yes, it represents the tree that helps us translate into Morse code. (For instance, the letters E and T sit at the top of the tree, directly below the root.)

Our next step is to indent the code, which improves its readability.

[Image: the preprocessed code after indentation]

Much better. Now it is easier to recognize some structures, but before continuing we need to rename some identifiers: _DIT becomes DOT, DAH_ becomes DASH, DIT_ becomes DOTS and _DIT_ becomes DOTSDOTS. We also rename the functions _DAH and __DIT to TRANSFORM_TO_MORSE and PRINT_MORSE.

[Image: the code after renaming the identifiers]

Now our file has the shape and structure of a readable C file.

Besides main, we have two extra functions: PRINT_MORSE and TRANSFORM_TO_MORSE.

PRINT_MORSE uses the write function to print one character (one dot or one dash).

TRANSFORM_TO_MORSE uses PRINT_MORSE, and it plays the key role: it transforms each input character into Morse code. It is also a recursive function. When DOTS_ > 3, it calls itself with DOTS_ shifted one bit to the right; otherwise it prints a zero character.

The main function uses three nested loops. The inner loop checks whether the input contains characters and converts lower-case characters to upper case. The middle loop calls TRANSFORM_TO_MORSE when the input character is part of the _DAH_[] array. The outer loop prints a new line when the input line is finished.

In this way we can “de-obfuscate” the code. It is a good exercise for understanding the power of macros in C, and it also shows the problems that arise when we abuse them.
