Friday, January 16, 2015

process vs thread difference

The differences are significant:

The major differences between threads and processes are:
  1. Threads share the address space of the process that created them; processes have their own address space.
  2. Threads have direct access to the data segment of their process; processes get their own copy of the parent process's data segment (see the code sketch below).
  3. Threads can communicate directly with other threads of their process; processes must use interprocess communication to communicate with sibling processes.
  4. Threads have almost no overhead; processes have considerable overhead.
  5. New threads are easily created; new processes require duplication of the parent process.
  6. Threads can exercise considerable control over threads of the same process; processes can only exercise control over child processes.
  7. Changes to the main thread (cancellation, priority change, etc.) may affect the behavior of the other threads of the process; changes to the parent process do not affect child processes.
From: https://stackoverflow.com/questions/200469/what-is-the-difference-between-a-process-and-a-thread
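
To make points 1 and 2 concrete, here is a minimal C sketch (names like counter and thread_fn are just illustrative) contrasting a POSIX thread, which shares the creating process's data segment, with a fork()ed child, which gets its own copy:

    #include <pthread.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int counter = 0;                      /* lives in the process's data segment */

    static void *thread_fn(void *arg) {
        (void)arg;
        counter++;                        /* same address space as main() */
        return NULL;
    }

    int main(void) {
        pthread_t t;
        pthread_create(&t, NULL, thread_fn, NULL);
        pthread_join(t, NULL);
        printf("after thread: counter = %d\n", counter);   /* 1: visible to main */

        pid_t pid = fork();
        if (pid == 0) {                   /* child works on its own copy */
            counter++;
            _exit(0);
        }
        waitpid(pid, NULL, 0);
        printf("after fork:   counter = %d\n", counter);   /* still 1 in the parent */
        return 0;
    }

Compiled with cc -pthread, the first printf should report counter = 1 because the thread's increment happened in the shared address space, while the second should still report counter = 1 because the child incremented its own private copy.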

First, let's look at the theoretical aspect. To understand the difference between a process and a thread, and what is shared between them, you need to understand what a process is conceptually.
We have the following from section 2.2.2 The Classical Thread Model in Modern Operating Systems 3e by Tanenbaum:
The process model is based on two independent concepts: resource grouping and execution. Sometimes it is useful to separate them; this is where threads come in....
He continues:
One way of looking at a process is that it is a way to group related resources together. A process has an address space containing program text and data, as well as other resources. These resources may include open files, child processes, pending alarms, signal handlers, accounting information, and more. By putting them together in the form of a process, they can be managed more easily. The other concept a process has is a thread of execution, usually shortened to just thread. The thread has a program counter that keeps track of which instruction to execute next. It has registers, which hold its current working variables. It has a stack, which contains the execution history, with one frame for each procedure called but not yet returned from. Although a thread must execute in some process, the thread and its process are different concepts and can be treated separately. Processes are used to group resources together; threads are the entities scheduled for execution on the CPU.
Further down he provides the following table:
Per process items             | Per thread items
------------------------------|-----------------
Address space                 | Program counter
Global variables              | Registers
Open files                    | Stack
Child processes               | State
Pending alarms                |
Signals and signal handlers   |
Accounting information        |
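The two columns map directly onto what a program can observe. In the sketch below (POSIX threads, illustrative names), both threads print the address of a global variable and of one of their own local variables: the global address is the same in both, while each local lives on its thread's private stack:

    #include <pthread.h>
    #include <stdio.h>

    int shared = 42;                      /* per-process: one copy in the shared address space */

    static void *worker(void *name) {
        int local = 0;                    /* per-thread: lives on this thread's own stack */
        printf("%s: &shared=%p  &local=%p\n",
               (char *)name, (void *)&shared, (void *)&local);
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, "thread 1");
        pthread_create(&t2, NULL, worker, "thread 2");
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        /* Both lines show the same &shared but different &local addresses. */
        return 0;
    }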
Let's deal with the hardware multithreading issue. Classically, a CPU would support a single thread of execution, maintaining the thread's state via a single program counter and set of registers. But what happens if there's a cache miss? It takes a long time to fetch data from main memory, and while that's happening the CPU is just sitting there idle. So someone had the idea to keep two sets of thread state (PC + registers) so that another thread (maybe in the same process, maybe in a different process) can get work done while the first thread is waiting on main memory. There are multiple names and implementations of this concept, such as Hyper-Threading and Simultaneous Multithreading (SMT for short).
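As a small illustration (a sketch, assuming the common glibc/BSD sysconf() extension _SC_NPROCESSORS_ONLN), the CPU count the OS reports on an SMT machine counts logical processors, i.e. hardware threads, not just physical cores:

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        /* On an SMT / Hyper-Threading machine the OS usually exposes each hardware
           thread as a logical CPU, so this can be twice the physical core count. */
        long logical = sysconf(_SC_NPROCESSORS_ONLN);
        printf("logical CPUs online: %ld\n", logical);
        return 0;
    }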
Now let's look at the software side. There are basically three ways that threads can be implemented:
  1. Userspace Threads
  2. Kernel Threads
  3. A combination of the two
All you need to implement threads is the ability to save the CPU state and maintain multiple stacks, which can in many cases be done entirely in user space. The advantages of user-space threads are very fast thread switching, since you don't have to trap into the kernel, and the freedom to schedule your threads the way you like. The biggest drawback is the inability to do blocking I/O (a blocking call would block the entire process and all of its user threads), which is one of the big reasons we use threads in the first place: blocking I/O from a thread greatly simplifies program design in many cases.
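As a rough sketch of "save the CPU state and maintain multiple stacks", the following uses the POSIX ucontext API (obsolescent, but still available on Linux/glibc) to switch between two user-space contexts cooperatively; no kernel scheduling decision is involved in the swapcontext() calls:

    #include <stdio.h>
    #include <stdlib.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, worker_ctx;

    static void worker(void) {
        printf("worker: running on its own stack\n");
        swapcontext(&worker_ctx, &main_ctx);   /* save worker state, resume main */
        printf("worker: resumed\n");
        /* Returning from worker() activates uc_link, i.e. main_ctx. */
    }

    int main(void) {
        enum { STACK_SIZE = 64 * 1024 };
        char *stack = malloc(STACK_SIZE);      /* the worker's private stack */

        getcontext(&worker_ctx);               /* initialize the context */
        worker_ctx.uc_stack.ss_sp = stack;
        worker_ctx.uc_stack.ss_size = STACK_SIZE;
        worker_ctx.uc_link = &main_ctx;        /* where to go when worker() returns */
        makecontext(&worker_ctx, worker, 0);

        printf("main: switching to worker\n");
        swapcontext(&main_ctx, &worker_ctx);   /* save main state, run worker */
        printf("main: back, switching again\n");
        swapcontext(&main_ctx, &worker_ctx);   /* resume worker after its swap */
        printf("main: done\n");
        free(stack);
        return 0;
    }

Each swapcontext() saves the current registers and stack pointer into one ucontext_t and restores the other, which is essentially what a user-space thread library's scheduler does on every switch.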
Kernel threads have the advantage of being able to use blocking I/O, in addition to leaving all the scheduling issues to the OS. But each thread switch requires trapping into the kernel, which can be relatively slow. However, if you're switching threads because of blocked I/O, this isn't really an issue, since the I/O operation has probably trapped you into the kernel already anyway.
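A short pthreads sketch of that point (illustrative names): one kernel thread blocks in read() while the main thread keeps doing work, so only the calling thread waits on the I/O, not the whole process:

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    /* The reader thread blocks inside the read() system call; because the kernel
       schedules threads individually, main() keeps running in the meantime. */
    static void *reader(void *arg) {
        char buf[64];
        (void)arg;
        ssize_t n = read(STDIN_FILENO, buf, sizeof buf);   /* blocking I/O */
        printf("reader: got %zd bytes\n", n);
        return NULL;
    }

    int main(void) {
        pthread_t t;
        pthread_create(&t, NULL, reader, NULL);
        for (int i = 0; i < 3; i++) {
            printf("main: still working (%d)\n", i);
            sleep(1);
        }
        pthread_join(t, NULL);                             /* wait for the reader */
        return 0;
    }

With purely user-space threads and no non-blocking I/O tricks, that read() would have stalled every thread in the process.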
Another approach is to combine the two, with multiple kernel threads each having multiple user threads.
So, getting back to your question of terminology: you can see that a process and a thread of execution are two different concepts, and your choice of which term to use depends on what you're talking about. Regarding the term "lightweight process", I personally don't see the point of it, since it doesn't convey what's going on as well as the term "thread of execution" does.


Reference (thanks to robert.s.barnes):

https://stackoverflow.com/questions/200469/what-is-the-difference-between-a-process-and-a-thread
