Program 2 - Multi-threaded Programming

Out: Fri 03/26/2004
Due: Mon 04/12/2004, 11pm

Multi-Threaded Server

Download: server.tar (this is a tar file.  Use the command "tar xf server.tar" to extract.  You should end up with a directory named server in your current working directory after the extraction.)

Now, let's build a multi-threaded server so that you learn (or remember) how to write multi-threaded code. Consider an Internet service, such as a web server, that reads clients' requests (HTTP), does some processing (reads the files being requested), and then responds to the requests (sends back the requested files). One of the challenges in building Internet services is dealing with concurrency: many independent clients can send requests to a given server simultaneously.

A very simple (too simple, in fact) design strategy is to build the server as a single-threaded program. The server's single thread waits until a new request arrives, reads the request, processes the request, and then writes a response back to the originating client. If another request arrives while the server is processing the first, that other request is ignored until the first has been dealt with fully.

A more sophisticated strategy would be to build the server as a multithreaded program. When a task arrives at the server, a thread is dispatched to handle it. If multiple tasks arrive concurrently, then multiple server threads will execute concurrently. This allows I/O activities (e.g., reads/writes to the network and disk) to be overlapped with computation required to process clients' requests.

In this part, you will turn a single-threaded server provided by us into a multi-threaded server by building a "thread pool" and integrating it into the server. A thread pool is an object that contains a fixed number of threads and supports two operations: dispatch, which causes one thread from the pool to wake up and enter a specified function, and destroy, which kills off all of the threads in the pool and cleans up any memory associated with the pool.

Build a thread pool in C

Design, implement, and test an abstraction called a thread pool. Your thread pool implementation should consist of two files: threadpool.h and threadpool.c. In the server.tgz archive, we've provided you with skeleton implementations of these files. You must not modify threadpool.h in any way. To build your thread pool, you should only add to the threadpool.c file that we've given you.

As you should see from threadpool.h, your thread pool implementation must support the following three operations:

threadpool create_threadpool(int num_threads_in_pool);

void dispatch(threadpool from_me, dispatch_fn dispatch_to_here, void *arg);

void destroy_threadpool(threadpool destroyme);
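To make the shape of these declarations concrete, here is a minimal sketch of how a function pointer and its argument travel together. The two typedefs are assumptions made for illustration; the definitions in the provided threadpool.h are the ones that count. The names bump_counter and run_inline are hypothetical and are not part of the assignment.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed typedefs, for illustration only; the provided threadpool.h
 * is authoritative and may differ. */
typedef void *threadpool;               /* opaque handle to a pool */
typedef void (*dispatch_fn)(void *);    /* work function run by a pool thread */

/* Hypothetical work function: treats its argument as an int counter. */
void bump_counter(void *arg)
{
    int *counter = arg;
    (*counter)++;
}

/* Calls the function directly with its argument. A real dispatch() would
 * hand this same (function, argument) pair to a pool thread instead. */
void run_inline(dispatch_fn fn, void *arg)
{
    assert(fn != NULL);
    fn(arg);
}
```

Because the argument is a plain (void *), the caller and the work function must agree on what it really points to.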

The create_threadpool() function should create and return a threadpool. The threadpool must have exactly num_threads_in_pool threads inside of it. If the threadpool can't be created for some reason, this function should return NULL.

The dispatch() function takes a threadpool as an argument, as well as a function pointer and a (void *) argument to pass into the function. If there are available threads in the threadpool, dispatch() will cause exactly one thread in the pool to wake up and invoke the supplied function with the supplied argument. (Once the function call returns, the dispatched thread will re-enter the thread pool.) If there are no available threads in the pool, i.e. they have all been dispatched, then dispatch() must block until one becomes available by returning from its previous dispatch.

In either case, as soon as a thread has been dispatched, dispatch() must return.

The destroy_threadpool() function will cause all of the threads in the threadpool to commit suicide, after which any memory allocated for the threadpool should be cleaned up.

There should be NO BUSYWAITING in any of your code.
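One possible way to meet these requirements without busy-waiting is to let both the workers and dispatch() sleep on pthread condition variables. The sketch below is only an illustration under stated assumptions: the typedefs mirror what threadpool.h presumably declares, the pool_t layout and the single pending-task slot are hypothetical design choices, and add_one is an example task that is not part of the assignment. Error handling is kept minimal, and calling dispatch() after destroy_threadpool() is not supported.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Assumed typedefs; the provided threadpool.h is authoritative. */
typedef void *threadpool;
typedef void (*dispatch_fn)(void *);

/* Hypothetical internal layout: one pending-task slot plus bookkeeping. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t work_ready;  /* a task was posted, or shutdown began */
    pthread_cond_t slot_free;   /* a worker went idle or emptied the slot */
    int num_threads;            /* threads actually started */
    int idle;                   /* waiting workers not yet claimed by dispatch */
    int shutdown;
    dispatch_fn fn;             /* pending task; NULL when the slot is empty */
    void *arg;
    pthread_t threads[];        /* flexible array, sized at creation */
} pool_t;

static void *worker(void *p)
{
    pool_t *pool = p;
    pthread_mutex_lock(&pool->lock);
    for (;;) {
        pool->idle++;
        pthread_cond_signal(&pool->slot_free);
        while (pool->fn == NULL && !pool->shutdown)
            pthread_cond_wait(&pool->work_ready, &pool->lock); /* sleep, no spin */
        if (pool->fn == NULL)
            break;                      /* shutting down, nothing pending */
        dispatch_fn fn = pool->fn;
        void *arg = pool->arg;
        pool->fn = NULL;
        pthread_cond_signal(&pool->slot_free);
        pthread_mutex_unlock(&pool->lock);
        fn(arg);                        /* run the task outside the lock */
        pthread_mutex_lock(&pool->lock);
    }
    pthread_mutex_unlock(&pool->lock);
    return NULL;
}

void destroy_threadpool(threadpool destroyme)
{
    pool_t *pool = destroyme;
    pthread_mutex_lock(&pool->lock);
    pool->shutdown = 1;
    pthread_cond_broadcast(&pool->work_ready);  /* wake every sleeping worker */
    pthread_mutex_unlock(&pool->lock);
    for (int i = 0; i < pool->num_threads; i++)
        pthread_join(pool->threads[i], NULL);   /* in-flight tasks finish first */
    pthread_cond_destroy(&pool->slot_free);
    pthread_cond_destroy(&pool->work_ready);
    pthread_mutex_destroy(&pool->lock);
    free(pool);
}

threadpool create_threadpool(int num_threads_in_pool)
{
    pool_t *pool = NULL;
    if (num_threads_in_pool > 0)
        pool = calloc(1, sizeof *pool + (size_t)num_threads_in_pool * sizeof(pthread_t));
    if (pool == NULL)
        return NULL;
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->work_ready, NULL);
    pthread_cond_init(&pool->slot_free, NULL);
    for (int i = 0; i < num_threads_in_pool; i++) {
        if (pthread_create(&pool->threads[i], NULL, worker, pool) != 0) {
            destroy_threadpool(pool);   /* joins only the threads that started */
            return NULL;
        }
        pool->num_threads = i + 1;
    }
    return pool;
}

void dispatch(threadpool from_me, dispatch_fn dispatch_to_here, void *arg)
{
    pool_t *pool = from_me;
    assert(pool != NULL && dispatch_to_here != NULL);
    pthread_mutex_lock(&pool->lock);
    while (pool->idle == 0 || pool->fn != NULL)
        pthread_cond_wait(&pool->slot_free, &pool->lock);  /* block, don't poll */
    pool->idle--;                       /* claim one idle worker */
    pool->fn = dispatch_to_here;
    pool->arg = arg;
    pthread_cond_signal(&pool->work_ready);
    pthread_mutex_unlock(&pool->lock);
}

/* Example task for the usage note: atomically adds 1 to an atomic_int. */
void add_one(void *arg)
{
    atomic_fetch_add((atomic_int *)arg, 1);
}
```

As a usage example, creating a pool of 2 threads, dispatching add_one eight times against the same atomic_int, and then destroying the pool should leave the counter at 8, since this destroy_threadpool() joins every worker after its current task. Every wait in the sketch is a pthread_cond_wait() inside a loop that rechecks its condition, so no thread spins while it has nothing to do.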

Use your thread pool to turn the single-threaded server into a multithreaded server.

In the server.tgz archive, we have provided you with a working implementation of a single-threaded network server. The server source code is in the file called server.c. The server is designed to do the following:

Create a "listening socket" on a network port specified as a command-line argument to the server. Because the server has created a listening socket, clients can now open connections to the server and send it data. In fact, multiple clients can simultaneously open multiple connections to the server. As explained in the background at the top of this web page, the single-threaded server doesn't handle multiple connections in parallel, but just works on them one at a time.

Perpetually loop, doing the following:

Accept a new connection from a client. (If there are multiple connections that have arrived, only one will be accepted.)

Read data from the new connection. (Our server reads 10 bytes.)

Process the request. Our server simply does some mindless computation on the 10 bytes. (How much computation it does can be altered by changing the NUM_LOOPS constant in the server source code.)

Write a response back to the client. (Our server writes 10 bytes.)

Close the connection to the client.
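The steps above can be sketched roughly as follows. This is not the provided server.c; it is an independent illustration. The helper names (make_listening_socket, read_full, write_full, process, serve_one), the backlog of 16, and the add-one computation are all assumptions chosen for the example.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define REQ_LEN 10   /* request and response size given in the text */

/* Create a TCP socket listening on the given port; returns the fd or -1. */
int make_listening_socket(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    int yes = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 || listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Read exactly len bytes; returns 0 on success, -1 on EOF or error. */
static int read_full(int fd, unsigned char *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = read(fd, buf + done, len - done);
        if (n <= 0)
            return -1;
        done += (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes; returns 0 on success, -1 on error. */
static int write_full(int fd, const unsigned char *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n <= 0)
            return -1;
        done += (size_t)n;
    }
    return 0;
}

/* Stand-in for the "mindless computation"; the loop in server.c may differ. */
static void process(unsigned char *buf, long num_loops)
{
    for (long i = 0; i < num_loops; i++)
        for (int j = 0; j < REQ_LEN; j++)
            buf[j] = (unsigned char)(buf[j] + 1);
}

/* One iteration of the loop body: read, process, respond, close. */
int serve_one(int conn_fd, long num_loops)
{
    assert(conn_fd >= 0);
    unsigned char buf[REQ_LEN];
    int rc = -1;
    if (read_full(conn_fd, buf, REQ_LEN) == 0) {
        process(buf, num_loops);
        rc = write_full(conn_fd, buf, REQ_LEN);
    }
    close(conn_fd);
    return rc;
}
```

A single-threaded main loop would then call accept() on the listening socket and pass each returned descriptor to serve_one(), so a second client waits until the first call returns.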

What you must do is change the server to work in the following way:

Create a thread pool. (How big the pool is should be specified as a command-line argument to your server.)

Create a listening socket.

Perpetually loop, doing the following:

Accept a new connection from a client.

Dispatch the connection to a thread from the thread pool. (Yes, if the thread pool is empty, the server's main thread will block in the dispatch until a thread becomes available.)

The dispatch function should do the following:

Read data from the dispatched connection.

Process the request.

Write a response back to the client.

Close the connection to the client.

Thus, each time a new connection arrives at the server, the main thread dispatches a thread from the threadpool to handle the connection.
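One detail worth illustrating is how the accepted connection reaches the dispatch function through the single (void *) argument. The sketch below assumes a hypothetical conn_arg structure allocated per connection; the names conn_arg, make_conn_arg, and connection_task, and the bit-flipping computation, are illustrative and do not come from the provided code.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/socket.h>   /* accept() in the main loop; socket descriptors */
#include <unistd.h>

#define REQ_LEN 10

/* Hypothetical per-connection argument. It is heap-allocated so each
 * dispatched task owns its own copy; passing the address of the accept
 * loop's local fd variable would race with the next accept(). */
typedef struct {
    int conn_fd;
    long num_loops;
} conn_arg;

conn_arg *make_conn_arg(int conn_fd, long num_loops)
{
    conn_arg *a = malloc(sizeof *a);
    if (a != NULL) {
        a->conn_fd = conn_fd;
        a->num_loops = num_loops;
    }
    return a;
}

/* Has the assumed dispatch_fn shape: void (*)(void *). */
void connection_task(void *arg)
{
    conn_arg *a = arg;
    assert(a != NULL);
    unsigned char buf[REQ_LEN];
    size_t got = 0, sent = 0;

    while (got < REQ_LEN) {                       /* read the request */
        ssize_t n = read(a->conn_fd, buf + got, REQ_LEN - got);
        if (n <= 0)
            break;
        got += (size_t)n;
    }
    if (got == REQ_LEN) {
        for (long i = 0; i < a->num_loops; i++)   /* stand-in computation */
            for (int j = 0; j < REQ_LEN; j++)
                buf[j] = (unsigned char)(buf[j] ^ 0x20);
        while (sent < REQ_LEN) {                  /* write the response */
            ssize_t n = write(a->conn_fd, buf + sent, REQ_LEN - sent);
            if (n <= 0)
                break;
            sent += (size_t)n;
        }
    }
    close(a->conn_fd);
    free(a);          /* the task, not the main thread, releases the argument */
}

/* In the server's main loop (dispatch() and the pool come from threadpool.h):
 *     int fd = accept(listen_fd, NULL, NULL);
 *     conn_arg *a = make_conn_arg(fd, num_loops);
 *     if (a != NULL) dispatch(pool, connection_task, a);
 */
```

The design choice here is ownership: the main thread allocates the argument and forgets it, and the worker closes the connection and frees the memory once the response has been written.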

We've also provided you with an implementation of a network client. The client is single-threaded: it loops forever, and in each loop iteration, it opens a connection to the server, writes a request, reads a response, and closes the connection. Therefore, to fully test your multithreaded server, you will need to run multiple clients in parallel.

To launch the single-threaded server, give the following command:

./server 4324
Here, 4324 is the "port" that the server listens to.

To launch a client, give the following command (in another window):

./client servername 4324
Here, servername is the IP address or hostname of the workstation that the server is running on.

To run multiple clients simultaneously, you could issue the following commands:

bash$ ./client localhost 4324 &
bash$ ./client localhost 4324 &
bash$ ./client localhost 4324 &
etc.

To fully test your server out, you'll need to launch at least as many clients as there are threads in your threadpool.

Measure the performance of your multithreaded server

Keep in mind that you may need SEVERAL quiet machines (that is, machines that no one else is logged into) to produce accurate results. So don't wait until the last day, when it will be impossible to find quiet machines.

Now that you have a working multithreaded server, it's time to put it through its paces. You will measure the throughput of your server as a function of the number of threads in the threadpool, and the amount of computation performed in the server. Throughput is simply defined as the number of tasks per second that the server can handle.
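As a small worked example of this definition, the sketch below computes tasks per second from a task count and two clock readings. Where and how the tasks are counted is left open by the assignment; this sketch simply assumes you have a count of completed request/response round trips and two struct timespec values, for instance taken with clock_gettime(CLOCK_MONOTONIC, ...). The function names are hypothetical.

```c
#include <assert.h>
#include <time.h>

/* Seconds elapsed between two clock readings (end is assumed to be later). */
double elapsed_seconds(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Throughput as defined in the text: completed tasks per second. */
double throughput(long tasks_completed, double seconds)
{
    assert(seconds > 0.0);
    return (double)tasks_completed / seconds;
}
```

For instance, 3000 tasks completed over an interval of 1.5 seconds corresponds to a throughput of 2000 tasks per second.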

To change how much computation the server performs, just alter the value of the NUM_LOOPS constant in the server source code. Higher numbers mean more computation. In fact, I suggest you make this another command-line argument to your server.
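If you do take that suggestion, the parsing might look like the sketch below. The argument order (port, pool size, NUM_LOOPS) and the names server_config, parse_positive, and parse_args are assumptions made for illustration; nothing in the provided code prescribes them.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical argument layout: ./server <port> <pool_size> <num_loops> */
typedef struct {
    long port;
    long pool_size;
    long num_loops;
} server_config;

/* Parse a strictly positive decimal integer; 0 on success, -1 on bad input. */
int parse_positive(const char *text, long *out)
{
    char *end = NULL;
    errno = 0;
    long v = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0' || v <= 0)
        return -1;
    *out = v;
    return 0;
}

/* Fill cfg from argv; returns 0 on success, -1 if anything is malformed. */
int parse_args(int argc, char **argv, server_config *cfg)
{
    assert(cfg != NULL);
    if (argc != 4)
        return -1;
    if (parse_positive(argv[1], &cfg->port) != 0 || cfg->port > 65535)
        return -1;
    if (parse_positive(argv[2], &cfg->pool_size) != 0)
        return -1;
    if (parse_positive(argv[3], &cfg->num_loops) != 0)
        return -1;
    return 0;
}
```

With this layout, a run such as "./server 4324 8 1000" would select port 4324, 8 threads, and 1000 loops, which makes it easy to script the measurements without recompiling.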

What you need to do is measure the throughput of the server in the following 36 conditions:

# threads in pool = 1 and NUM_LOOPS=1, 100, 1000, 10000, 100000, 500000
# threads in pool = 2 and NUM_LOOPS=1, 100, 1000, 10000, 100000, 500000
# threads in pool = 4 and NUM_LOOPS=1, 100, 1000, 10000, 100000, 500000
# threads in pool = 8 and NUM_LOOPS=1, 100, 1000, 10000, 100000, 500000
# threads in pool = 16 and NUM_LOOPS=1, 100, 1000, 10000, 100000, 500000
# threads in pool = 32 and NUM_LOOPS=1, 100, 1000, 10000, 100000, 500000


Using your favorite graphing package (such as Microsoft Excel), plot the throughput of your server as a function of the number of threads in the thread pool. You should plot a separate line for each different value of the NUM_LOOPS variable. You should experiment with logarithmic-scale axes in order to get your graph to reveal as much information as possible.

What and How to Turn In
Again, submit your completed assignment using Handin.

You should submit a gzipped tar file. This file should untar into the following:

Your writeup should include: