But for the rest of us, it is. Most servers today are sold as multiprocessor units, with dual CPUs and more in a single system--designed to perform at unprecedented speeds. What is the real difference among the "multis" (multitasking, multiprocessing, multiprogramming and multithreading)? How do they fit together? And how do "multis" affect operating system features such as I/O and CPU scheduling of processes?
Threads and Processes
Before we can discuss multithreading and multiprocessing, we must define threads and processes. A thread is not only what your sanity is hanging by, but also a single flow of execution: a sequential set of instructions carried out one after another. A process is a collection of one or more threads, and--you guessed it--an application is a collection of one or more processes.
In a single-threaded application, only one thread exists for the entire program, so the program can follow only one path of execution at a time. A Web server that accepts a client connection and fills the request before accepting another connection is single threaded. Multithreading is the practice of using multiple threads within a single process, thus offering multiple points of execution (one per thread) and the ability to perform multiple tasks at one time. This model can execute instructions concurrently, increasing the throughput of the application. A Web server that accepts a client connection and uses multiple threads to fill the request more quickly would be considered multithreaded.
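As a rough illustration of the difference, the Python sketch below fills the same set of simulated requests first with one thread and then with one thread per request. The names `handle_request`, `serve_single_threaded` and `serve_multithreaded` are hypothetical, and the short sleep merely stands in for slow I/O such as a disk read; this is not how any particular Web server is written.

```python
import threading
import time

def handle_request(request_id, results):
    """Hypothetical handler: the sleep stands in for slow I/O such as a disk read."""
    time.sleep(0.1)
    results[request_id] = f"response {request_id}"

def serve_single_threaded(request_ids):
    """Fill each request completely before starting the next one."""
    results = {}
    for request_id in request_ids:
        handle_request(request_id, results)
    return results

def serve_multithreaded(request_ids):
    """Start one thread per request so the waits overlap."""
    results = {}
    threads = [
        threading.Thread(target=handle_request, args=(request_id, results))
        for request_id in request_ids
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait for every request to finish
    return results

if __name__ == "__main__":
    for serve in (serve_single_threaded, serve_multithreaded):
        start = time.perf_counter()
        serve(range(5))
        print(f"{serve.__name__}: {time.perf_counter() - start:.2f}s")
```

Both functions return the same responses; the multithreaded version should simply finish sooner, because the five waits overlap instead of running back to back.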
Multiprocess applications use multiple processes to accomplish this same task. The most common implementation of this is the "forking" model, in which child processes are created by the parent process. These child processes execute the same code as the process that created them, but each runs in its own address space and shares no memory with the parent.
As an example, you start your Web server, and it waits for a request from a client. When a forking-model Web server (such as Apache 1.3.x) receives a request, it will create a duplicate of itself (a new process) and pass the request to that copy. If another request comes in, the same thing happens. Within each process several threads may be running. One could perform the DNS lookup on the client IP address for logging purposes, while another is finding the resource the client requested. A multithreaded Web server would not create additional processes; instead, it would create multiple threads to perform the same task.
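Here is a minimal sketch of the forking model in Python, assuming a Unix-like system where `os.fork` is available. The function `fork_and_modify` and the counter it changes are made up for illustration; a real server would hand the child a client connection instead. The point is that the child runs the same code but works on its own copy of memory.

```python
import os

def fork_and_modify():
    """Fork a child that changes a variable; return what parent and child each saw."""
    counter = 0
    read_fd, write_fd = os.pipe()  # pipe carries the child's view back to the parent
    pid = os.fork()
    if pid == 0:
        # Child: runs the same code, but on its own copy of memory.
        try:
            os.close(read_fd)
            counter += 100
            os.write(write_fd, str(counter).encode())
            os.close(write_fd)
        finally:
            os._exit(0)  # leave immediately so the child never runs parent code
    # Parent: the child's change to counter is not visible here.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_counter = int(reader.read())
    os.waitpid(pid, 0)  # reap the child
    return counter, child_counter

if __name__ == "__main__":
    parent_value, child_value = fork_and_modify()
    print(f"parent sees {parent_value}, child saw {child_value}")
```

Because the two processes share no memory, the child has to send its result back through a pipe. Threads in one process could simply read the same variable.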
Why use a multiprocess model? If one of the child processes for a Web server crashes, it doesn't affect the rest of the processes. Only one client is affected. If the Web server uses a multithreaded model instead and a thread dies, it can take the whole process down with it, so every client that process is serving can be affected, causing discontent among users. However, creating a process takes longer than creating a thread. So a multithreaded Web server responds faster than a multiprocess server in most circumstances.
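The isolation argument can be sketched in a few lines of Python, again assuming a Unix-like system with `os.fork`. The helper `run_children` is hypothetical, and the "crash" is simulated by having one child send itself `SIGKILL`; the parent then checks how each child ended.

```python
import os
import signal

def run_children(crashing_index, count=3):
    """Fork `count` children; the one at crashing_index kills itself."""
    pids = []
    for index in range(count):
        pid = os.fork()
        if pid == 0:
            if index == crashing_index:
                os.kill(os.getpid(), signal.SIGKILL)  # simulated crash
            os._exit(0)  # healthy child finishes normally
        pids.append(pid)
    outcomes = []
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        # A child ended by a signal counts as crashed; the others exited cleanly.
        outcomes.append("crashed" if os.WIFSIGNALED(status) else "ok")
    return outcomes

if __name__ == "__main__":
    print(run_children(crashing_index=1))
```

Only the child that killed itself is reported as crashed. The parent and the other children carry on, which is the property the multiprocess model buys at the price of slower process creation.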
A compromise is a cross between the multiprocess and multithreaded models. A Web server application may start a specific number of processes and use them as a pool. Instead of creating new processes to service requests, the application will simply manage the processes and hand off requests to one of the available processes. In turn, each process can keep a pool of threads that it can use to handle each request. This model is perhaps the most efficient, because it lets the application focus solely on answering and passing requests to other processes. Each process can dole out specific tasks to existing threads. If one process crashes, it does not affect the entire application.
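The pooling idea can be sketched with Python's standard `concurrent.futures` module. This example shows only the thread-pool half of the compromise: a fixed set of worker threads is reused for every request instead of being created on demand. The handler `handle_pooled_request`, the function `serve_with_pool` and the pool size of three are assumptions for illustration. `ProcessPoolExecutor` in the same module exposes the same executor interface for a pool of processes.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def handle_pooled_request(request_id):
    """Hypothetical handler that reports which pooled thread served the request."""
    return request_id, threading.current_thread().name

def serve_with_pool(request_ids, pool_size=3):
    """Hand every request to one of a fixed set of reusable worker threads."""
    with ThreadPoolExecutor(max_workers=pool_size, thread_name_prefix="worker") as pool:
        # map() returns results in request order, whichever worker ran each one.
        return list(pool.map(handle_pooled_request, request_ids))

if __name__ == "__main__":
    for request_id, worker_name in serve_with_pool(range(10)):
        print(f"request {request_id} handled by {worker_name}")
```

However many requests arrive, no more than `pool_size` threads are ever created, so the cost of thread creation is paid a bounded number of times rather than once per request.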
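Multiprocess applications create child processes to increase performance, but most applications do not use a multiprocess model. Why? Each process costs more memory and takes longer to create than a thread does. Threads are cheaper, but they have disadvantages of their own.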
One of the biggest disadvantages to threads is the possibility of deadlock: two or more threads stop executing because each is waiting for a resource that another one holds. While this can occur between processes, it is more likely to occur with threads, because threads in the same process share memory and compete for the same locks that guard it. Because each thread is waiting for the other to finish, the affected threads, and often the entire process, stand still until the process is killed or restarted by external means.
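The following Python sketch reproduces the situation under controlled conditions. In `try_opposite_order`, two threads each take one lock and then ask for the lock the other is holding; a timeout on the second acquire is the only reason the demonstration ends instead of hanging. In `use_same_order`, both threads take the locks in the same order, a common way to avoid this kind of deadlock. All names and the 0.2-second timeout are illustrative assumptions.

```python
import threading

def try_opposite_order(timeout=0.2):
    """Two threads take two locks in opposite order; report whether each got both."""
    lock_a, lock_b = threading.Lock(), threading.Lock()
    holding_first = threading.Barrier(2)  # both threads hold their first lock
    attempts_done = threading.Barrier(2)  # both threads finished trying the second
    got_both = {}

    def worker(name, first, second):
        with first:
            holding_first.wait()
            # Without the timeout, this acquire would wait forever: a deadlock.
            acquired = second.acquire(timeout=timeout)
            got_both[name] = acquired
            attempts_done.wait()
            if acquired:
                second.release()

    threads = [
        threading.Thread(target=worker, args=("t1", lock_a, lock_b)),
        threading.Thread(target=worker, args=("t2", lock_b, lock_a)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return got_both

def use_same_order():
    """Both threads take lock_a before lock_b, so neither can block the other forever."""
    lock_a, lock_b = threading.Lock(), threading.Lock()
    finished = []

    def worker(name):
        with lock_a:
            with lock_b:
                finished.append(name)

    threads = [threading.Thread(target=worker, args=(name,)) for name in ("t1", "t2")]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(finished)

if __name__ == "__main__":
    print("opposite order:", try_opposite_order())
    print("same order:", use_same_order())
```

In the opposite-order case neither thread obtains its second lock, which is exactly the standstill described above. With a consistent order, both threads finish.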
Another disadvantage of threads is blocking. When a thread makes certain types of system calls, such as reads on sockets and other I/O, the call will not return until it has finished or is interrupted by a signal to the process. Unfortunately, if an error occurs during the call, it can take a long time for the call to return (if it returns at all). During this time, the thread can execute no other instruction.
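One common way to limit the damage is to put a timeout on the blocking call. The Python sketch below uses a connected socket pair on which nothing is ever sent, so a plain `recv` would block indefinitely; with `settimeout`, the call gives up and the thread regains control. The function name `read_with_timeout` and the 0.2-second limit are assumptions for illustration.

```python
import socket

def read_with_timeout(seconds=0.2):
    """Attempt a read that would block forever, but give up after `seconds`."""
    reader, writer = socket.socketpair()  # connected pair; nothing is ever sent
    try:
        reader.settimeout(seconds)  # without this, recv() below would never return
        try:
            return reader.recv(1024)
        except socket.timeout:
            return None  # the call gave up, so the thread is free to do other work
    finally:
        reader.close()
        writer.close()

if __name__ == "__main__":
    print("received:", read_with_timeout())
```

A timeout does not make the call any faster; it only bounds how long the thread can be stuck before it gets a chance to do something else.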
But for most applications, the disadvantages of threads are smaller than the memory and creation-time costs of processes. For these reasons, most applications already use, or are moving toward using, a multithreading model. Some applications, especially in the Web server space, have joined the two models and are using a multiprocess model in which each process is also multithreaded. This model offers the advantages of both models while minimizing their disadvantages.