MultiProcessing vs MultiThreading

As the years go by, computational needs grow: we handle ever-larger data sets for processing, analysing and visualising, and we simulate increasingly complex systems. It is essential to understand multiprocessing and multithreading to utilise our computing resources efficiently.

Process & Threads

Before getting into multiprocessing and multithreading, let’s have a look at some basics.

  • A process is an independent instance of a running program, executed on a processor core.
  • Threads are components of a process, and they can run concurrently.

Let’s visualise the terms defined above. The figure below shows a schematic representation of one process instance per core, and the threads per process, for a dual-core processor.

You might wonder: how many cores does our system have? How many threads does each core have?

A simple answer for Windows users is to look in the Task Manager.

The figure above shows my system's processor information, with the relevant details highlighted. My system has an Intel i7 quad-core processor with a clock speed of 2.81 GHz. The core count is 4 (quad-core) and the logical processor count is 8. Logical processors aren’t physical processors; they are a result of Intel’s Hyper-Threading, which lets one physical core present itself to the operating system as two. The core count gives us the true number of processors, and dividing the logical processor count by the core count gives the number of hardware threads per core. In our case, it’s 2 (8 / 4). Modern CPUs with hyper-threading typically have two threads per core, whereas GPUs keep many more threads in flight per core.
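
The logical processor count can also be read programmatically. The following is a minimal sketch using only Python's standard library; the helper name logical_processors is made up for illustration. Note that os.cpu_count() reports logical processors (8 on the machine described above), not physical cores.

```python
import os


def logical_processors() -> int:
    """Return the number of logical processors reported by the OS (at least 1)."""
    # os.cpu_count() can return None when the count cannot be determined.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(f"Logical processors: {logical_processors()}")
```

The standard library does not report physical cores directly, so the threads-per-core ratio still has to come from a tool such as the Task Manager.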

Characteristics of Processes and threads

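  • Processes do not share the same memory space, while threads within a process do.
  • As threads share the same memory inside a process, they are lighter and carry less overhead. Hence, they are faster to create and switch between, and sharing data between them is easier, although access to shared data must be synchronised (for example with a lock) to stay safe.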
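
As a rough illustration of threads sharing memory, the sketch below lets several Python threads update one counter object. The function name and the numbers are made up for the example. Because every thread sees the same object, the update is guarded with a lock; without it, concurrent read-modify-write steps can overwrite each other.

```python
import threading


def count_with_threads(num_threads: int = 4, increments: int = 10_000) -> int:
    """Have several threads increment one shared counter, guarded by a lock."""
    counter = {"value": 0}  # a single object, visible to every thread
    lock = threading.Lock()

    def worker() -> None:
        for _ in range(increments):
            # The lock makes the read-modify-write step atomic.
            with lock:
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    # Every increment from every thread lands in the same counter.
    print(count_with_threads())
```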

Concurrency and Parallelism

Before moving into Multiprocessing and Multithreading, we should also look into Concurrency and Parallelism.

  • Concurrency is when two or more tasks start, run and complete in overlapping time periods. It doesn’t necessarily mean that they are running at the same instant.
  • Parallelism is when two or more tasks run at the same instant, for example on separate cores.

Let’s look at a real-life example to understand concurrency and parallelism. Most of us have experienced waiting in a government office to get some documents processed. Consider a scenario where you need to go to the government office to get a registration done, but you also have an online lecture lined up.

In terms of concurrency, you can go to the government office, get a token, and attend the online lecture on your mobile phone while you wait. In terms of parallelism, you can ask a friend to get the token and wait in the queue while you stay home for the lecture until your token number is about to be called.

Rule of Thumb:

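Multithreading implements concurrency; multiprocessing implements parallelism. This holds in particular for Python. In languages without a global interpreter lock, threads can also run in parallel on separate cores.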

Multiprocessing and Multithreading

Multiprocessing refers to utilising multiple processors/CPUs in a single system. These processors handle multiple processes simultaneously. It helps us increase the computing capability of the system.

Multithreading refers to multiple threads being executed concurrently by a single processor (recall the concurrency example above). It helps to increase the throughput of a single processor.

The below figure shows an example of how multithreading is handled in simulation software.

Characteristics of Multiprocessing

Multiple processors handle multiple processes simultaneously, increasing the computing power. This implements parallelism.

Pros
  • Takes advantage of multiple processors in the system, thus increasing computing power.
  • Separate memory and resources get allocated for each process or program.
  • A must for CPU-bound processing.
  • Avoids the GIL (Global Interpreter Lock) limitation in Python, since each process runs its own interpreter.
Cons
  • Larger memory overhead, since each process gets its own memory space.
  • Creating a process is expensive and time-consuming compared with creating a thread.
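
The separate memory of processes can be observed with a small experiment. This is only a sketch using the standard multiprocessing module, and the names counter, increment and run_in_child_process are made up for illustration. A child process increments a module-level counter, but it works on its own copy, so the value in the parent process stays unchanged.

```python
import multiprocessing as mp

counter = {"value": 0}  # module-level state in the parent process


def increment() -> None:
    """Increment the counter in whichever process this function runs."""
    counter["value"] += 1


def run_in_child_process() -> int:
    """Run increment() in a child process, then return the parent's value."""
    child = mp.Process(target=increment)
    child.start()
    child.join()
    # The child changed its own copy; the parent's dictionary is untouched.
    return counter["value"]


if __name__ == "__main__":
    print("Parent sees:", run_in_child_process())
```

To get a result back from a child process, the data has to be sent explicitly, for example through a multiprocessing.Queue or the return value of a process pool. That extra copying is part of the overhead listed above.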

Characteristics of Multithreading

Multiple threads within a process are executed concurrently by a processor, increasing the processor's throughput. This implements concurrency.

Pros
  • Shared memory makes access easier.
  • Multithreading is lightweight owing to its shared memory capability.
  • A great option for I/O-bound applications.
  • Makes a responsive user interface feasible.
Cons
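  • In Python (CPython), the GIL lets only one thread execute Python bytecode at a time, so multithreading does not speed up CPU-bound code. It remains useful for I/O-bound work.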
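
To see how the GIL affects threads in practice, the sketch below runs the same pure-Python computation sequentially and then in a thread pool. The function names and workload sizes are made up for illustration. On a standard CPython build, the threaded run usually takes about as long as the sequential one, because only one thread executes Python bytecode at a time. Actual timings depend on the machine and the Python build.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def busy_sum(n: int) -> int:
    """Pure-Python CPU-bound work: the sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed(label, fn):
    """Call fn(), print how long it took, and return its result."""
    start = time.perf_counter()
    result = fn()
    print(f"{label}: {time.perf_counter() - start:.2f} s")
    return result


if __name__ == "__main__":
    jobs = [2_000_000] * 4
    sequential = timed("sequential", lambda: [busy_sum(n) for n in jobs])
    with ThreadPoolExecutor(max_workers=4) as pool:
        threaded = timed("4 threads", lambda: list(pool.map(busy_sum, jobs)))
    # Both approaches compute the same values; only the timing differs.
    assert sequential == threaded
```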

I/O Bound tasks & CPU Bound tasks

I/O bound tasks

Consider a scenario where a block of code waits for user input or for a response from another part of the program. If we use multiprocessing to handle the different code blocks, the dedicated computing resources are wasted on blocks that remain idle until they get input. In such scenarios, multithreading can handle the different blocks concurrently with no extra computing resources. Hence, multithreading is better suited to GUIs and other I/O-based tasks.
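
A minimal sketch of this idea in Python follows. The function fake_io is a hypothetical stand-in for a blocking call such as a network request or a file read, and it simply sleeps. While one thread waits, the others can proceed, so four waits of 0.2 s each overlap instead of adding up.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(task_id: int, delay: float = 0.2) -> int:
    """Hypothetical stand-in for a blocking I/O call; it only sleeps."""
    time.sleep(delay)  # the thread is idle here, so other threads can run
    return task_id


def run_threaded(task_ids):
    """Run all tasks in a thread pool; return (results, elapsed seconds)."""
    task_ids = list(task_ids)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max(1, len(task_ids))) as pool:
        results = list(pool.map(fake_io, task_ids))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_threaded(range(4))
    # Run one after another, the four tasks would need about 0.8 s in total.
    print(results, f"{elapsed:.2f} s")
```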

CPU bound tasks

In a scenario where we process a huge dataset, computation time is the bottleneck: the processor is kept busy rather than waiting on input. This demands an increase in computational power, which multiprocessing offers. Image and video processing are examples of CPU-bound tasks. GPUs are designed to handle such tasks with their massive computing power (their core counts are in the thousands).
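
As a hedged sketch of the multiprocessing approach to CPU-bound work, the example below spreads a deliberately heavy computation over a pool of worker processes using the standard concurrent.futures module. The prime-counting function and the input sizes are made up for illustration and are not an optimised algorithm.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit: int) -> int:
    """Count the primes below limit by trial division (deliberately CPU-heavy)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits):
    """Evaluate count_primes for each limit in a pool of worker processes."""
    with ProcessPoolExecutor(max_workers=os.cpu_count()) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    # Each chunk runs in its own process, so the chunks can use separate cores.
    print(count_primes_parallel([50_000] * 4))
```

The guard on the last lines matters: on platforms that start worker processes by re-importing the main module, unguarded pool creation would be triggered again in every worker.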

Conclusion

I have tried to cover the theory behind multiprocessing and multithreading in this post. In the next post, I will discuss their practical implementation in Python and verify the memory-sharing ability of multithreading.


  • Parallel Processing
  • Revan Kumar
  • Comments :
      3 Comments
    • Eashwar

      First off, a very nice try and hats off to your effort. I really do not understand the difference between a process and a thread. I can see that you have provided some definitions for them, but what do they mean in practice? How does the code you write end up being considered a thread or a process by your CPU? Any insight on that would be great.

      • Thanks. I can understand your point. I refrained from getting into the Python implementation in this post, as I wanted to write a theoretical explanation and then follow it up with a good practical implementation in Python. I will post it soon.

      • Revan Kumar

        To make threads and processes clearer, think through what happens when you open a piece of software, say a Word document. The moment you open it, the operating system allocates memory to it; the amount varies with the nature of the software. This is what we call a process: an instance of a program running on the CPU with its own share of memory. To run successfully, the process (in our case, Word) needs to respond to your input devices, such as the mouse and keyboard, and perform various actions based on the features selected (creating a new document, finding text, replacing, saving the document). To handle these actions within the program, multiple components called threads are created inside the process, and all of these threads refer to the same memory. I will show this difference in memory allocation in the next post, where I will execute tasks as threads and as processes in Python. Hope it clears your doubt.

