win32: Properly wait until the thread finishes in DLL unload
Without this we run into a problem: after 50ms we forcibly kill
the pool thread. If the pool thread holds the allocation lock at
that point because it was not scheduled within those 50ms, the lock
is never released, which causes a deadlock and prevents the DLL
from unloading. Further, in the unload scenario (as opposed to
process exit), we may actually want to let the thread pool finish,
in case it is completing an operation on behalf of another thread
(since the user may be expecting the answer).
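
For illustration only, here is a minimal sketch of the "signal, then
wait without a timeout" pattern described above. It is not the code
from this change: it uses POSIX threads as a portable stand-in, and
every name in it (alloc_lock, pool_thread, run_pool_and_unload) is
hypothetical. On Win32 the rough equivalent of pthread_join here would
be WaitForSingleObject(thread, INFINITE), and the forced kill being
avoided would be TerminateThread.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names come from the patch. */
static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER; /* the "allocation lock" */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER; /* guards the fields below */
static pthread_cond_t  state_cond = PTHREAD_COND_INITIALIZER;
static int  pending_jobs  = 0;
static int  finished_jobs = 0;
static bool shutting_down = false;

static void *pool_thread(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&state_lock);
        while (pending_jobs == 0 && !shutting_down)
            pthread_cond_wait(&state_cond, &state_lock);
        if (pending_jobs == 0) {
            /* Shutdown was requested and the queue is drained. */
            pthread_mutex_unlock(&state_lock);
            return NULL;
        }
        pending_jobs--;
        pthread_mutex_unlock(&state_lock);

        /* The job runs under the allocation lock. If the thread were
         * killed at this point, alloc_lock would stay locked forever. */
        pthread_mutex_lock(&alloc_lock);
        finished_jobs++;
        pthread_mutex_unlock(&alloc_lock);
    }
}

/* Queue `jobs` items, then perform the "unload": ask the thread to stop
 * and wait for it with no timeout. Returns the number of jobs that
 * completed, or -1 on error or if alloc_lock was left held. */
int run_pool_and_unload(int jobs)
{
    pthread_t thread;
    int done;

    pending_jobs  = jobs;
    finished_jobs = 0;
    shutting_down = false;
    if (pthread_create(&thread, NULL, pool_thread, NULL) != 0)
        return -1;

    /* Request shutdown; the thread still drains the queued work first. */
    pthread_mutex_lock(&state_lock);
    shutting_down = true;
    pthread_cond_signal(&state_cond);
    pthread_mutex_unlock(&state_lock);

    /* Wait for the thread to exit instead of killing it after a timeout. */
    if (pthread_join(thread, NULL) != 0)
        return -1;

    /* The thread exited on its own, so the lock must be free again. */
    if (pthread_mutex_trylock(&alloc_lock) != 0)
        return -1;
    done = finished_jobs;
    pthread_mutex_unlock(&alloc_lock);
    return done;
}
```

In this sketch the unload path lets queued work finish and can still
take alloc_lock afterwards. Killing the thread after a fixed delay
could not guarantee either property.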