<chapter id="threading">
<title>Multi-threading in Wine</title>

<para>
This section assumes you understand the basics of multithreading. If not, there are plenty of
good tutorials available on the net to get you started.
</para>

<para>
Threading in Wine is somewhat complex due to several factors. The first is the advanced level of
multithreading support provided by Windows - there are far more threading-related constructs available
in Win32 than in the Linux equivalent (pthreads). The second is the need to map Win32 threads
to native Linux threads, which provides us with benefits like having the kernel schedule them without
our intervention. While it's possible to implement threading entirely without kernel support, doing so
is not desirable on most platforms that Wine runs on.
</para>

<sect1>
<title>Threading support in Win32</title>

<para>
Win32 is an unusually thread-friendly API. Not only is it entirely thread safe, but it provides
many different facilities for working with threads. These range from the basics, such as starting
and stopping threads, to the extremely complex, such as injecting threads into other processes and
COM inter-thread marshalling.
</para>

<para>
One of the primary challenges of writing Wine code therefore is ensuring that all our DLLs are
thread safe, free of race conditions and so on. This isn't simple - don't be afraid to ask if
you aren't sure whether a piece of code is thread safe or not!
</para>

<para>
Win32 provides many different ways to make your code thread safe; however, the most common
are <emphasis>critical sections</emphasis> and the <emphasis>interlocked functions</emphasis>.
Critical sections are a type of mutex designed to protect a region of code. If you don't
want multiple threads running in a piece of code at once, you can protect it with calls to
EnterCriticalSection and LeaveCriticalSection. The first call to EnterCriticalSection by a thread
will lock the section and continue without stopping. If another thread then calls it, that thread
will block until the original thread calls LeaveCriticalSection.
</para>

<para>
If you use critical sections to make some code thread-safe, it is therefore vitally important
to check every possible codepath out of that code to ensure that any held sections are left.
Code like this:
</para>

<programlisting> if (res != ERROR_SUCCESS) return res; </programlisting>

<para>
is extremely suspect in a function that also contains a call to EnterCriticalSection. Be careful.
</para>
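
<para>
One common way to keep every exit path honest is to route all of them through a single
unlock. The sketch below is not Wine code: it uses a pthread mutex as a stand-in for a
critical section so that it builds outside Wine, and the names table_lock, table_entries
and add_entry are made up for illustration.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <pthread.h>

/* Stand-in for a CRITICAL_SECTION (assumption: pthread mutex used for portability). */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static int table_entries;

/* Hypothetical helper: adds one entry, or fails with `res` if it is non-zero.
 * Every exit path goes through the single unlock at `done`. */
int add_entry(int res)
{
    int ret = 0;

    pthread_mutex_lock(&table_lock);      /* plays the role of EnterCriticalSection */

    if (res != 0)
    {
        ret = res;                        /* a bare "return res;" here would leak the lock */
        goto done;
    }
    table_entries++;

done:
    pthread_mutex_unlock(&table_lock);    /* plays the role of LeaveCriticalSection */
    return ret;
}
```
]]></programlisting>

<para>
If the error path returned directly instead of jumping to the unlock, the next call to
add_entry would block forever on the lock that was never released.
</para>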

<para>
If a thread blocks while waiting for another thread to leave a critical section, you will
see an error from the RtlpWaitForCriticalSection function, along with a note of which
thread is holding the lock. This only appears after a certain timeout, normally a few
seconds. It's possible that the thread holding the lock is just being really slow, which is why
Wine won't terminate the app like a non-checked build of Windows would. The most
common cause, however, is that a thread forgot to call LeaveCriticalSection or died
while holding the lock (perhaps because it was in turn waiting for another lock). This
doesn't just happen in Wine code: a deadlock while waiting for a critical section could
be due to a bug in the app triggered by a slight difference in the emulation.
</para>

<para>
Another popular mechanism is the family of functions like InterlockedIncrement and
InterlockedExchange. These use native CPU abilities to execute a single
instruction while ensuring that no other processor on the system can access the memory involved, and
allow you to do common operations like add/remove/check a variable in thread-safe code
without holding a mutex. They are especially useful for reference counting in
free-threaded (thread safe) COM objects.
</para>
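
<para>
As a rough illustration of lock-free reference counting, the sketch below uses C11 atomics
as a portable analogue of InterlockedIncrement and InterlockedDecrement. It is not Wine
code: struct object, object_addref and object_release are hypothetical names, and the
destroyed flag stands in for actually freeing the object.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical object with an atomically updated reference count. */
struct object
{
    atomic_long refs;
    int destroyed;      /* set instead of freeing, so the effect can be observed */
};

/* Analogue of InterlockedIncrement: returns the new count. */
long object_addref(struct object *obj)
{
    return atomic_fetch_add(&obj->refs, 1) + 1;
}

/* Analogue of InterlockedDecrement: returns the new count and
 * marks the object destroyed when the last reference goes away. */
long object_release(struct object *obj)
{
    long refs = atomic_fetch_sub(&obj->refs, 1) - 1;

    assert(refs >= 0);              /* releasing more often than adding is a bug */
    if (refs == 0) obj->destroyed = 1;
    return refs;
}
```
]]></programlisting>

<para>
Because the decrement and the read of the new value happen as one atomic step, exactly one
thread sees the count reach zero, so exactly one thread performs the cleanup.
</para>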

<para>
Finally, TLS slots are also popular. TLS stands for thread-local storage, and is
a set of slots scoped local to a thread in which you can store pointers. Look on MSDN for the
TlsAlloc function to learn more about the Win32 implementation of this. Essentially, the
contents of a given slot will be different in each thread, so you can use this to store data
that is only meaningful in the context of a single thread. On recent versions of Linux the
__thread keyword provides a convenient interface to this functionality - a more portable API
is exposed in the pthread library. However, these facilities are not used by Wine; rather, we
implement Win32 TLS entirely ourselves.
</para>
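
<para>
To make the "different in each thread" property concrete, here is a small sketch using the
__thread keyword mentioned above together with pthreads. It only illustrates the concept
and says nothing about how Wine implements Win32 TLS; the names slot, struct result,
worker and run_worker are invented for the example.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

static __thread int slot;           /* every thread gets its own copy */

struct result { int seen_before; int seen_after; };

/* Runs in a second thread and records what that thread sees in its slot. */
static void *worker(void *arg)
{
    struct result *res = arg;

    res->seen_before = slot;        /* a new thread starts with a zeroed copy */
    slot = 42;                      /* changes only this thread's copy */
    res->seen_after = slot;
    return NULL;
}

/* Sets the caller's slot to 7, runs the worker, and returns the caller's slot. */
int run_worker(struct result *res)
{
    pthread_t thread;

    slot = 7;
    if (pthread_create(&thread, NULL, worker, res) != 0) return -1;
    pthread_join(thread, NULL);
    return slot;                    /* the worker's write is not visible here */
}
```
]]></programlisting>

<para>
The calling thread still reads back its own value after the worker has written a different
one, which is exactly the behaviour a TLS slot provides.
</para>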
</sect1>

<sect1>
<title>SysLevels</title>

<para>
SysLevels are an undocumented Windows-internal thread-safety system. They are basically
critical sections which must be taken in a particular order. The mechanism is generic but
there are always three syslevels: level 1 is the Win16 mutex, level 2 is the USER mutex
and level 3 is the GDI mutex.
</para>

<para>
When entering a syslevel, the code (in dlls/kernel/syslevel.c) will check that a
higher syslevel is not already held and produce an error if so. This is because it's not
legal to enter level 2 while holding level 3 - first, you must leave level 3.
</para>
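
<para>
The ordering rule can be modelled in a few lines. The sketch below is a simplified
illustration and is not taken from syslevel.c: the held array, enter_level and leave_level
are hypothetical, and the model only tracks which levels the current thread holds without
taking any real lock.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>

#define MAX_LEVEL 3     /* 1 = Win16, 2 = USER, 3 = GDI, as listed above */

/* Per-thread count of how often each level is currently held (hypothetical layout). */
static __thread int held[MAX_LEVEL + 1];

/* Returns 0 on success, or -1 if this thread already holds a higher level. */
int enter_level(int level)
{
    assert(level >= 1 && level <= MAX_LEVEL);
    for (int i = level + 1; i <= MAX_LEVEL; i++)
        if (held[i]) return -1;     /* ordering violation */
    held[level]++;                  /* a real implementation would also take the lock */
    return 0;
}

/* Drops one hold on the given level. */
void leave_level(int level)
{
    assert(level >= 1 && level <= MAX_LEVEL);
    assert(held[level] > 0);
    held[level]--;
}
```
]]></programlisting>

<para>
In this model, entering level 2 while holding level 3 fails, whereas entering level 3 while
holding level 2 succeeds, which matches the rule described above.
</para>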

<para>
Throughout the code you may see calls to _ConfirmSysLevel() and _CheckNotSysLevel(). These
functions are essentially assertions about the syslevel states and can be used to check
that the rules have not been accidentally violated. In particular, _CheckNotSysLevel()
will break (probably into the debugger) if the check fails. If this happens, the solution
is to get a backtrace and find out, by reading the source of the Wine functions called
along the way, how Wine got into the invalid state.
</para>

</sect1>

<sect1>
<title>POSIX threading vs kernel threading</title>

<para>
Wine runs in one of two modes: either pthreads (POSIX threading) or kthreads (kernel
threading). This section explains the differences between them. The one that is used is
automatically selected on startup by a small test program which then execs the correct
binary, either wine-kthread or wine-pthread. On NPTL-enabled systems pthreads will be
used, and on older non-NPTL systems kthreads is selected.
</para>

<para>
Let's start with a bit of history. Back in the dark ages when Wine's threading support was
first implemented, a problem was faced - Windows had much more capable threading APIs than
Linux did. This was awkward because Wine works either by reimplementing an API entirely
or by mapping it onto the underlying system's equivalent. How could Win32 threading be
implemented using a library which did not have all the needed features? The answer, of
course, was that it couldn't be.
</para>

<para>
On Linux the pthreads interface is used to start, stop and control threads. The pthreads
library in turn is based on top of so-called "kernel threads" which are created using the
clone(2) syscall. Pthreads provides a nicer (more portable) interface to this
functionality and also provides APIs for controlling mutexes. There is a
<ulink url="http://www.llnl.gov/computing/tutorials/pthreads/">
good tutorial on pthreads </ulink> available if you want to learn more.
</para>

<para>
As pthreads did not provide the necessary semantics to implement Win32 threading, the
decision was made to implement Win32 threading on top of the underlying kernel threads by
using syscalls like clone directly. This provided maximum flexibility and allowed a
correct implementation but caused some bad side effects. Most notably, all the userland
Linux APIs assumed that the user was utilising the pthreads library. Some only enabled
thread safety when they detected that pthreads was in use - this is true of glibc, for
instance. Worse, pthreads and pure kernel threads had strange interactions when run in
the same process, yet some libraries used by Wine used pthreads internally. Throw in
source code porting using Winelib - where you have both UNIX and Win32 code in the same
process - and chaos was the result.
</para>

<para>
The solution was simple yet ingenious: Wine would provide its own implementation of the pthread
library <emphasis>inside</emphasis> its own binary. Due to the semantics of ELF symbol
scoping, this would cause Wine's own implementations to override any implementation loaded
later on (like the real libpthread.so). Therefore, any calls to the pthread APIs in
external libraries would be linked to Wine's instead of the system's pthreads library, and
Wine implemented pthreads by using the standard Windows threading APIs it in turn
implemented itself.
</para>

<para>
As a result, libraries that only became thread-safe in the presence of a loaded pthreads
implementation would now do so, and any external code that used pthreads would actually
end up creating Win32 threads that Wine was aware of and controlled. This worked quite
nicely for a long time, even though it required doing some extremely un-kosher things like
overriding internal libc structures and functions. That is, it worked until NPTL was
developed, at which point the underlying thread implementation on Linux changed
dramatically.
</para>

<para>
The fake pthread implementation can be found in loader/kthread.c, which is used to
produce the wine-kthread binary. In contrast, loader/pthread.c produces the wine-pthread
binary which is used on newer NPTL systems.
</para>

<para>
NPTL is a new threading subsystem for Linux that hugely improves its performance and
flexibility. By allowing threads to become much more scalable and adding new pthread
APIs, NPTL made Linux competitive with Windows in the multi-threaded world. Unfortunately
it also broke many assumptions made by Wine (as well as other applications such as the
Sun JVM and RealPlayer) in the process.
</para>

<para>
There was, however, some good news. NPTL made Linux threading powerful enough
that Win32 threads could now be implemented on top of pthreads like any other normal
application. There would no longer be problems with mixing win32-kthreads and pthreads
created by external libraries, and no need to override glibc internals. As you can see
from the relative sizes of the loader/kthread.c and loader/pthread.c files, the
difference in code complexity is considerable. NPTL also made several other semantic
changes to things such as signal delivery, so changes were required in many different
places in Wine.
</para>

<para>
On non-Linux systems the threading interface is typically not powerful enough to
replicate the semantics Win32 applications expect and so kthreads with the
pthread overrides are used.
</para>
</sect1>

<sect1>
<title>The Win32 thread environment</title>

<para>
All Win32 code, whether from a native EXE/DLL or in Wine itself, expects certain constructs to
be present in its environment. This section explores what those constructs are and how Wine
sets them up. The lack of this environment is one thing that makes it hard to use Wine code
directly from standard Linux applications - in order to interact with Win32 code a thread
must first be "adopted" by Wine.
</para>

<para>
The first thing Win32 code requires is the <emphasis>TEB</emphasis> or "Thread Environment
Block". This is an internal (undocumented) Windows structure associated with every thread
which stores a variety of things such as TLS slots, a pointer to the thread's message queue,
the last error code and so on. You can see the definition of the TEB in include/thread.h, or
at least what we know of it so far. Being internal and subject to change, the layout of the
TEB has had to be reverse engineered from scratch.
</para>

<para>
A pointer to the TEB is stored in the %fs register and can be accessed using NtCurrentTeb()
from within Wine code. %fs actually stores a selector, and setting it therefore requires
modifying the process's local descriptor table (LDT) - the code to do this is in lib/wine/ldt.c.
</para>

<para>
The TEB is required by nearly all Win32 code run in the Wine environment, as any wineserver
RPC will use it, which in turn implies that any code which could possibly block (for instance
by using a critical section) needs it. The TEB also holds the SEH exception handler chain as
the first element, so if you see code like this when disassembling:
</para>

<programlisting> movl %esp, %fs:0 </programlisting>

<para>
... then you are seeing the program set up an SEH handler frame. All threads must have at
least one SEH entry, which normally points to the backstop handler which is ultimately
responsible for popping up the all-too-familiar "This program has performed an illegal
operation and will be terminated" message. On Wine we just drop straight into the debugger.
A full description of SEH is out of the scope of this section; however, there are some good
articles in MSJ if you are interested.
</para>
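
<para>
The handler chain is a singly linked list whose head lives in the first TEB field. The
sketch below is a portable model of that idea, not the real Win32 structures: struct frame,
chain_head, the simplified handler signature and all function names are assumptions made
for illustration, and a thread-local variable stands in for the slot addressed by %fs:0.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Model of a handler frame: a link to the previous frame plus a handler. */
struct frame
{
    struct frame *prev;
    int (*handler)(int code);   /* simplified signature: non-zero means "handled" */
};

/* Stand-in for the first TEB field, which %fs:0 addresses. */
static __thread struct frame *chain_head;

/* Installs a frame at the head of the chain (the model's "movl %esp, %fs:0"). */
void push_frame(struct frame *frame)
{
    frame->prev = chain_head;
    chain_head = frame;
}

/* Removes the innermost frame again. */
void pop_frame(struct frame *frame)
{
    assert(chain_head == frame);
    chain_head = frame->prev;
}

/* Walks from the innermost frame outwards; returns the frame that handled the code. */
struct frame *dispatch(int code)
{
    for (struct frame *f = chain_head; f != NULL; f = f->prev)
        if (f->handler(code)) return f;
    return NULL;
}

/* Example handlers: one that only accepts code 5, and a backstop that accepts anything. */
int handles_five(int code) { return code == 5; }
int backstop(int code) { (void)code; return 1; }
```
]]></programlisting>

<para>
With a backstop frame pushed first, every code that no inner frame accepts ends up at the
backstop, which mirrors the role of the last-resort handler described above.
</para>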

<para>
All Win32-aware threads must have a wineserver connection. Many different APIs
require the ability to communicate with the wineserver. In turn, the wineserver must be aware
of Win32 threads in order to be able to accurately report information to other parts of the
program and do things like route inter-thread messages, dispatch APCs (asynchronous procedure
calls) and so on. Therefore a part of thread initialization is initializing the thread
on the server side. The result is not only correct information in the server, but also a set of file
descriptors the thread can use to communicate with the server - the request fd, reply fd and
wait fd (used for blocking).
</para>

</sect1>
</chapter>