More about Fork() in Linux

As most of us know, the fork() call creates a new child process that is a duplicate of its parent. On success, fork() returns the PID of the child in the parent process and returns zero in the child process.

       pid_t fork(void);   // declaration of the fork
                           // return type is pid_t

The return type of fork() is pid_t, which is best treated as an opaque type (POSIX only requires it to be a signed integer type).

Furthermore, on failure fork() returns -1 in the parent's context, no child is created, and errno is set to indicate the specific error.

Internally, fork() is built on the same machinery as the clone() system call. The call sequence is roughly fork() -> clone() -> do_fork() -> copy_process() -> dup_task_struct() and so on (recent kernels have renamed do_fork() to kernel_clone()). The main point here is that clone() takes many flags which determine the behavior and the types of resources shared between the parent and the child. Some of these flags are:

CLONE_FILES  // parent and child will share the open file descriptor table
CLONE_VM     // parent and child will share the address space

From this it is clear that a child process and a thread are created through the same mechanism. It is the set of flags passed to the system call that determines whether the new task is a separate child process or a new thread within the same process.

One of the main questions we used to have was which process runs first after fork() has returned. Since both the parent and the child resume at the same point once fork() returns successfully, we used to assume the order was simply some arbitrary internal decision of the kernel.

However, there is something interesting going on behind the scenes. It is often said that the child gets an exact copy of the parent's address space. This is true to some extent, but the copy is not made immediately at fork(). Let me give you a scenario.

Think of a process of around 20 MB, written in such a way that upon the user's request our main thread should open Firefox.

This can be implemented with a combination of fork() and one of the exec family of calls, such as execl(). But since our main thread just needs to open the Firefox browser, what is the need for copying the complete 20 MB address space into the child?

Doing so would be inefficient and pointless. The child immediately loads a new program, and that program gets its own fresh address space in which Firefox runs. There is no point in copying the parent's complete address space for the child only to throw it away as soon as Firefox starts.

This is avoided with a technique called COW (copy-on-write). After fork(), the parent and the child share the same physical pages, mapped read-only in both. A process gets its own private copy of a page only when one of them performs a write to that page.

If the child (or the parent) tries to write to one of these shared pages, a page fault happens. The page fault handler checks whether the process is allowed to write to that address (which it is, since the region is writable in its own address space) and, if so, makes a fresh copy of just that page for the writer. The rest of the address space stays shared until it is written to as well.

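Hence, we can see why the kernel has reason to let the child run first after fork(). If the child immediately starts a new program using the exec family of calls, nothing needs to be copied at all, whereas a parent that runs first and writes to its memory would trigger page copies that the child never uses. Older Linux kernels did schedule the child first for this reason. Since Linux 2.6.32 the parent runs first by default, so programs should not rely on either order.

Either way, copy-on-write makes child creation and program start-up much cheaper than a full copy would be. Some other operating systems take a different route: Windows, for example, combines process creation and program loading in a single call, CreateProcess(), rather than separate fork() and exec steps. The cheapness of fork() under copy-on-write is one reason process creation on Linux is generally regarded as fast.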

Article by Aman Kanwar

Sagar Khairnar

AUTOMOTIVE | BSP | RTOS | Embedded Systems | C | Linux | CANJ1939

5y

Detailed explanation.. One can get more clarity regarding working of fork() after reading this.

Shivam Gupta

Embedded Software Engineer IV @Cisco | IBM SME Trainer | IIoT | AI/ML | Edge ML | FOTA | ARM | Linux Kernel | RTOS | #ShivamCDAC

5y

#MustRead Article! Thanks Aman Kanwar ! Will be waiting for more such articles.

Ajay Rajan

Sr. Software System Designer (Security) @ AMD

5y

Very informative! Keep posting more such articles Aman Kanwar

Aman Sharma

Senior Embedded Engineer | EdgeQ

5y

Voila! My friend.This is very informative article.

Petr Dvořák

#ThatKiCADguy :: Content creator :: HW designer :: (He/Him)

5y

Great article, Aman. I have read it all at once.
