
Multithreaded Programming - Part 3.


Oh dear, what now! He starts part 2 with a discussion about seashells, and part 3 with a kettle!

In part 1 of the tutorial, whilst converting the original serial program to a multithreaded one, I added a call to the Sleep() function. I didn't say why I'd done that, so now we'll put that oversight right.


    Sleep(10000);

If I take the call out and run the program, on my system I get the following output.

[Image: Threads4]

You may get something different depending on your system, and how busy it is. Clearly, something has gone very wrong, but what?

The answer lies in the threading model used by Windows. When the program's main() function exits, the process ends, and any threads that were started by the program are terminated at that point. Readers who come to Windows multithreading from other backgrounds, *NIX, POSIX or pthreads for example, where the conventions may not match this, are frequently floored by this simple difference.

The call to Sleep() was added to give the threads we spun enough time to run to completion before the main() function exited. This is obviously a hopeless bodge! In this test program it works, but in real-world applications you may not know how long a thread will take. It may be processing a database query on a server which is sometimes busy, sometimes not, or a network connection with an unknown host. There is no way to know how long these things will take. We have to know when the threads have actually finished.

We need 2 things. We need a way of identifying the threads, and a way to tell if they are finished or not. So, let's start with a way of identifying the threads.

----------

When we created the threads back in part 1, we called _beginthread() and ignored any return value it might give us. Actually, the function returns useful information. In particular, assuming the call works, it returns the created thread's HANDLE.

If you want to use a pen, you pick it up and use it via its handle; if you held it the other way round it wouldn't write. If you are boiling the kettle, chances are, again, you'll use it via its handle to avoid getting burnt. The handle is what you use when you want to use something.

Similarly, a HANDLE is what Windows uses to identify objects of certain types. As you program Windows, you'll come across many things that are used via a HANDLE. Some people think of a HANDLE as a kind of pointer, others as an internal name. Use whatever metaphor you like, but don't worry too much about it: HANDLEs are simply HANDLEs, and are returned by some functions and used by others. You rarely do anything with HANDLEs other than obtain them and pass them; you don't often need to manipulate them.

HANDLE, like CRITICAL_SECTION that we used in part 2, is a type accessible to your program when you've included <windows.h>, and like CRITICAL_SECTION you can declare variables of type HANDLE. Here is the program modified to obtain and store the thread HANDLEs.

int main()
{
    HANDLE hThreads[2];

    InitializeCriticalSection(&Section);

    hThreads[0] = (HANDLE)_beginthread(Func1,
                                       0,
                                       NULL);

    hThreads[1] = (HANDLE)_beginthread(Func2,
                                       0,
                                       NULL);

    Sleep(10000);

    DeleteCriticalSection(&Section);

    cout << "Main exit" << endl;

    return 0;
}

The first thing to notice is that the declaration of the HANDLEs is contained within the main() function, not globally. Since it is the main() function that will be doing the waiting, it is only relevant for main() to see the HANDLEs.

Second, I stored the 2 HANDLEs in an array, rather than as 2 individual HANDLE variables. I could have done it the other way, but you'll see why I did this shortly.

Thirdly, notice I had to convert the return value of _beginthread() to a HANDLE by casting it. This is because _beginthread() is specified as returning a uintptr_t, an unsigned integer that is 32 or 64 bits wide depending on the platform. Without the cast, the normal rules of C++ would prevent the assignment.

Finally, I did not check to see if any errors occurred. In a real-world program, you should, of course, do this. _beginthread() returns -1 if it encounters an error. In small programs like this, it rarely fails, but in a project with hundreds of large complicated threads, it is possible to run out of system resources, memory for example.

----------

On the surface, that doesn't seem to have got us anywhere, but some handles are smarter than others!

If I pick up a pen and start writing with it, there is no way the pen can tell when I have finished my letter, (remember when we used to use pens and actually write letters?!). Now, let's go back to that kettle. That's not the same at all. On the handle of my kettle, and most others I have seen in recent years, there is a switch of some kind that clicks "off" when it is finished. Indeed, on mine, the switch has a lamp in it which also goes off when it is finished. The handle on the kettle is a bit smarter: it is capable of signalling its completion.

I said above that many Windows objects also have handles. Some of these objects have smart handles as well; they too can signal completion. Thread HANDLEs fall into this category. To get rid of that Sleep(), it is simply necessary to cause your main() to wait for the threads to signal their completion.

The function we will use to do this is WaitForMultipleObjects(). The function takes 4 parameters. Here is the main() function again, modified to use this routine.

int main()
{
    HANDLE hThreads[2];

    InitializeCriticalSection(&Section);

    hThreads[0] = (HANDLE)_beginthread(Func1,
                                       0,
                                       NULL);

    hThreads[1] = (HANDLE)_beginthread(Func2,
                                       0,
                                       NULL);

    WaitForMultipleObjects(2,
                           hThreads,
                           TRUE,
                           INFINITE);

    DeleteCriticalSection(&Section);

    cout << "Main exit" << endl;

    return 0;
}

The first 2 parameters refer to the array we created earlier, which is why I did it as an array rather than separate variables. This function expects the HANDLEs to be passed as an array. The first parameter is a count of how many HANDLEs there are in the array, and the second is the array itself.

The third parameter is a boolean. It tells the function how to wait. If we want to wait until all the HANDLEs have signalled completion, then we set it to TRUE, (this is what we want in this case). If the parameter is set to FALSE, then the wait returns as soon as any one of the HANDLEs signals.

The final parameter allows us to tell the function how long to wait. If we say INFINITE, it will wait until the threads signal, regardless of how long it takes. It is possible to put an integer value here instead, which is a timeout value in milliseconds. If the threads do not complete within the specified time, the function returns anyway, so test the return value from the routine: if a timeout occurs, the function returns the constant WAIT_TIMEOUT.

The output looks the same as before when we used the Sleep(), but if you compile and run both, you'll see the new version finishes much sooner. Note that no changes were necessary to the thread functions to achieve this.

----------

We have now taken a single threaded program and turned it into a multithreaded version of itself. I hope you'll agree, it was not that difficult.

Multithreading, in essence, isn't difficult. What is required, though, is thought and planning. A lot of problems, with synchronisation for example, are caused when people try to impose multithreading on a complicated program that was not designed that way. Various bodges are introduced, and hard-to-find errors and lockups appear, often only intermittently. Multithreaded programs should be designed to be multithreaded. The synchronisation strategy should be part of the design.

----------

The final point I'd like to make about the program we have developed here is with regard to performance. A common failing is to assume that multithreading an application will make it run faster.

In the past, to a certain extent, that was true. When a thread wanted to output to a device such as a disk, the thread was suspended while the disk carried out the I/O. This was often a substantial period of time as well, because the disks were so slow.

Now, most I/O requests from a thread, for example, are not carried out straight away. They are cached, which means the system has queued them for execution at a later, more convenient, more efficient time.

When a thread is pre-empted, frequently, it will be necessary to complete or flush all cached operations. The partially decoded next instructions in the processor's pipelines are discarded. The thread's registers and stack have to be saved. All this happens before the next thread's registers and stack can be restored, its instructions can start to be decoded, and it can run. This happens every time a thread is pre-empted, and it adds up to a substantial overhead.

If you carefully time the original serial program, and the multithreaded version we've ended up with, you'll find that the multithreaded program runs slower than the original because of all of this overhead. The point here is that certain types of application are not suited to multithreading.

On a single processor, tasks that are CPU bound, running calculations etc., are generally not well suited to extensive multithreading. Where it is applicable is when you have a single thread doing lots of calculations and another handling the program's user interface. Splitting lengthy calculations into several shorter parallel operations on such a machine generally hits performance.

Tasks which run a lot of network traffic using blocking I/O, tasks processing lengthy database enquiries on variably loaded servers, remote or local, and in general any tasks which often "sit and wait" for something else to happen, can get a great performance boost from correct use of multithreading.

Multithreading is about design - it takes no prisoners!

----------

At the outset, I said this was a very brief introduction to this vast subject. I urge everyone to read the documentation, help and MSDN to find out more about the routines we've used, the concepts and the pitfalls. As soon as time allows, I'll put some of my more advanced multithreading material back online.

Now, it would seem, that my kettle is signalling the fact that it is time for tea.


Copyright © adrianxw, 1997 - 2004.