More about fork() in Linux
As most of us know, the fork() call creates a new child process that is a duplicate of its parent. On success, fork() returns the PID of the child in the parent process and returns zero in the child process.
pid_t fork(void); // declared in <unistd.h>; the return type is pid_t
The return type of fork() is pid_t, a signed integer type defined in <sys/types.h>.
Furthermore, on failure fork() returns -1 in the parent's context, no child is created, and errno is set to indicate the specific error.
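The three possible return values can be seen in a small sketch. The helper name fork_and_collect() is hypothetical and exists only for this illustration; it forks once, lets the child exit with a given code, and returns what the parent collected.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks once, lets the child exit with `code`,
 * and returns the exit status the parent collected (or -1 on error). */
int fork_and_collect(int code)
{
    pid_t pid = fork();

    if (pid == -1) {
        /* Failure: no child exists and errno says why (e.g. EAGAIN, ENOMEM). */
        fprintf(stderr, "fork failed: %s\n", strerror(errno));
        return -1;
    }

    if (pid == 0) {
        /* Child: fork() returned 0 here. */
        _exit(code);
    }

    /* Parent: fork() returned the child's PID, which is always positive. */
    assert(pid > 0);

    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling fork_and_collect(7) from main() should return 7 in the parent, since the child does nothing except exit with that code.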
Internally, fork() ends up in the same kernel code as the clone() system call. On older kernels the sequence is roughly fork() -> clone() -> do_fork() -> copy_process() -> dup_task_struct() -> and so on (newer kernels name the do_fork() step kernel_clone()). My main point here is that clone() takes many flags which determine the behavior and the types of resources that are shared between the parent and the child. Some of these flags are:
CLONE_FILES // parent and child share the table of open file descriptors
CLONE_VM // parent and child share the address space
Hence, it is clear that a child process and a thread are created through the same mechanism. It is the flags passed to the system call that determine whether the new task is a child process or a new thread within the same process.
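To see the effect of one such flag, here is a minimal sketch using the glibc clone() wrapper, once with CLONE_VM (the flag that shares the address space) and once without it. The names shared_counter, bump_counter() and counter_after_clone() are made up for this example, and the 64 KB stack size is an arbitrary assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Written by the new task, read back by the parent. */
static volatile int shared_counter = 0;

/* Entry point of the new task: its return value becomes the exit code. */
static int bump_counter(void *arg)
{
    (void)arg;
    shared_counter = 1;
    return 0;
}

/* Hypothetical helper: runs bump_counter() in a task created with
 * `extra_flags` and returns the counter value the parent sees afterwards
 * (or -1 on error). */
int counter_after_clone(int extra_flags)
{
    const size_t stack_size = 64 * 1024;   /* assumed to be enough here */
    char *stack = malloc(stack_size);
    if (stack == NULL)
        return -1;

    shared_counter = 0;

    /* The stack grows downwards on most architectures, so pass its top.
     * SIGCHLD as the exit signal lets the parent reap with waitpid(). */
    pid_t pid = clone(bump_counter, stack + stack_size,
                      extra_flags | SIGCHLD, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }
    assert(pid > 0);

    int rc = waitpid(pid, NULL, 0);
    free(stack);
    if (rc == -1)
        return -1;

    return shared_counter;
}
```

With CLONE_VM the parent should see the value 1, because both tasks write to the same memory. With no extra flags the new task behaves like a fork()ed child, so the parent's counter should stay 0.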
One of the main questions we used to have was which process runs first after fork() has returned, since both the parent and the child resume execution at the same point once fork() has returned successfully. We used to assume that the kernel simply decides this arbitrarily.
However, something interesting is going on behind the scenes. It is often said that the child gets an exact copy of the parent's address space. This is true to some extent, but the copy is not made immediately at fork(). Let me give you a scenario.
Think of a process of around 20 MB, where the program is written so that, upon the user's request, the main thread opens Firefox.
This can be implemented using fork() combined with the execl() family of calls. But since the main thread only needs to open the Firefox browser, what is the need to copy the complete 20 MB address space into the child's address space?
That would be inefficient and wasteful, because the child immediately executes a new program, and that program gets its own address space in which the Firefox browser runs. There is no point in copying the parent's complete address space for the child only to discard it as soon as Firefox has started.
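The fork() plus exec pattern from this scenario can be sketched as follows. The helper name run_program() is hypothetical, and the exit code 127 for a failed exec is a shell convention borrowed here as an assumption; execlp() is used so that the program is searched on PATH.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: starts `program` (looked up on PATH) in a child
 * process and returns its exit status, 127 if it could not be executed,
 * or -1 on any other error. */
int run_program(const char *program)
{
    assert(program != NULL);

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: replace this process image with the new program.
         * On success execlp() never returns. */
        execlp(program, program, (char *)NULL);
        _exit(127);   /* reached only if the exec failed */
    }

    /* Parent: wait for the child and report how it ended. */
    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the scenario above the call would be run_program("firefox"), assuming a firefox binary is on PATH. A real launcher would probably not block in waitpid() until the browser exits; the wait is kept here only to make the result observable.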
This is avoided with a technique called COW (copy-on-write). After fork(), the parent and the child share the same physical pages, mapped read-only in both. A private copy of a page is made only when one of them performs a write to that page; until then nothing is copied.
When the child (or the parent) tries to write to a shared page, a page fault occurs. The page fault handler checks whether the process has the right to write to that address (which it does, within its own address space), and only then creates a fresh copy of that page for the writer. Pages that are never written are never copied.
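The page fault itself is not visible from user space, but its effect is: a write in the child never shows up in the parent. A minimal sketch of that observable behavior, with the hypothetical names value and parent_value_after_child_write() chosen only for this example:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lives on a page that parent and child share after fork(). */
static int value = 42;

/* Hypothetical helper: the child overwrites `value`, then the parent
 * returns what it still sees in its own address space (or -1 on error). */
int parent_value_after_child_write(void)
{
    value = 42;

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: this write goes to the child's private copy of the page. */
        value = 99;
        _exit(value == 99 ? 0 : 1);
    }

    /* Parent: wait until the child has finished writing. */
    assert(pid > 0);
    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    /* The parent's page was never touched by the child's write. */
    return value;
}
```

The function should return 42 even though the child stored 99, which is the separation that copy-on-write has to preserve while deferring the actual copying.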
Hence, we can conclude that it makes sense for the kernel to let the child run first after fork(). If the child immediately starts a new program using the execl() family of calls, the parent has not yet written to any shared page, so no copying of the address space is performed at all. Older Linux kernels favored the child for this reason, while newer ones (since about 2.6.32) run the parent first by default. In any case the order is not guaranteed, so programs should not rely on it. Copy-on-write keeps child creation and program startup cheap. Other operating systems take a different approach. Windows, for example, combines process creation and program loading in a single CreateProcess() call instead of offering fork() and execl() as separate steps. Process creation on Linux is still generally regarded as fast in comparison.
Article by Aman Kanwar