[kernel][futex] Refactor futex state tracking.
Refactor the way we track futex state in the kernel.

Previously, state was tracked with a FutexNode object held on the stack of each waiting thread. The FutexNode played two roles: when it was the head of a list of waiters, it maintained the per-futex state in addition to the per-waiter state; when it was just another waiter in the queue, it had storage for the per-futex state but was really only holding the per-waiter state. Whenever the head of the waiters for a particular futex changed, bookkeeping code needed to transfer the per-futex state from the old head to the new head. This approach was pretty neat, as it resulted in lockless O(1) allocation of futex state which could not fail (barring a kernel stack overflow). Still, there were some downsides to the approach. In particular...

++ Part of the per-waiter state was a wait_queue_t; basically every futex waiter in a queue had its own dedicated wait_queue object. While not incorrect, this approach makes things difficult when priority inheritance becomes involved. It is easier to use a single wait_queue_t for all of the futex waiters, as this allows the scheduler code to become involved in the decision of whom to release from a futex during a wake or requeue operation. It also aligns the data structure used to hold futex waiters with the rest of the kernel: as the scheduler evolves and changes are made to how waiters wait, a set of futex waiters is guaranteed to behave the same as any other set of waiting threads.

++ Because of the slightly tricky way in which the FutexNode storage was being held, custom container code was needed. Again, there is nothing inherently wrong with this, but the existing container code had no tests and raised the readability bar for someone new to the code.

++ As the per-futex-state storage grows in order to implement PI, the code which moves the futex state from the old head to the new head picks up new requirements which need to be met. Remembering to do this in all of the proper places becomes burdensome and prone to failure. While refactoring the code to centralize that logic would be possible, the logic is not needed at all if the bookkeeping is centralized.

So; enter this change. Instead of allocating this state on the stack whenever there is a waiter, we shift to the following approach. Start by recognizing that a futex only has state while it has waiters; therefore, the absolute maximum number of futex state structures we need in a process is limited to the number of threads in the process. When a thread is created, we dynamically allocate a structure for tracking per-futex state and contribute it to the process's FutexContext state pool. When a thread exits, it removes a state structure from the free pool and lets its reference go out of scope. In the meantime, code actually waiting on a futex simply looks up the futex state for the futex in question in the associative container (currently a hashtable backed by DLL buckets) which maps from futex id to futex state structure, or grabs a new one from the free pool if there are no current waiters for that futex. The free pool in this situation is a simple list, so our O(1) allocation property is preserved.
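To make the bookkeeping concrete, here is a minimal, hypothetical user-space sketch of the scheme described above. It is not the actual Zircon implementation: FutexState, FutexContext, GrowPool, ShrinkPool, GetOrActivate, and Release are illustrative names, and std::unordered_map, std::list, and std::mutex stand in for the kernel's hashtable backed by DLL buckets, its free list, and its lock.

```cpp
// Hypothetical, simplified model of the design described above -- not the
// actual Zircon FutexContext API.
#include <cstdint>
#include <list>
#include <memory>
#include <mutex>
#include <unordered_map>

struct Waiter {};  // Stand-in for a blocked thread.

// Per-futex state.  Exists only while the futex has waiters, and holds a
// single queue shared by all of that futex's waiters.
struct FutexState {
    uintptr_t futex_id = 0;
    std::list<Waiter*> wait_queue;  // One queue per futex, not per waiter.
};

class FutexContext {
public:
    // Called when a thread is created: the thread contributes one state
    // structure to the process-wide pool, so the pool can never hold fewer
    // entries than the number of threads able to block on a futex.
    void GrowPool() {
        std::lock_guard<std::mutex> guard(lock_);
        free_pool_.push_back(std::make_unique<FutexState>());
    }

    // Called when a thread exits: it removes one structure from the free
    // pool and lets it go out of scope.
    void ShrinkPool() {
        std::lock_guard<std::mutex> guard(lock_);
        if (!free_pool_.empty()) {
            free_pool_.pop_back();
        }
    }

    // Look up the active state for |futex_id|, or activate one from the free
    // pool.  Both paths are O(1) and cannot fail: by construction the pool is
    // never empty when a thread of this process needs to wait.
    FutexState& GetOrActivate(uintptr_t futex_id) {
        std::lock_guard<std::mutex> guard(lock_);
        auto it = active_.find(futex_id);
        if (it != active_.end()) {
            return *it->second;
        }
        std::unique_ptr<FutexState> state = std::move(free_pool_.back());
        free_pool_.pop_back();
        state->futex_id = futex_id;
        FutexState& ref = *state;
        active_.emplace(futex_id, std::move(state));
        return ref;
    }

    // When the last waiter leaves, return the state to the free pool.
    void Release(uintptr_t futex_id) {
        std::lock_guard<std::mutex> guard(lock_);
        auto it = active_.find(futex_id);
        if (it != active_.end() && it->second->wait_queue.empty()) {
            free_pool_.push_back(std::move(it->second));
            active_.erase(it);
        }
    }

private:
    std::mutex lock_;  // Protects both collections; the same lock already
                       // guarding a given futex's state, so no extra cost.
    std::unordered_map<uintptr_t, std::unique_ptr<FutexState>> active_;
    std::list<std::unique_ptr<FutexState>> free_pool_;
};
```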
We now need a lock to protect the free and active collections, but this can be the same lock which was already needed to protect a given futex's state, so no additional cost is paid. Finally, because of the argument given above about the maximum number of active futex states in a process matching the number of active threads in the process, we are guaranteed to never fail to allocate futex state.

ZX-1798 #comment Refactor futex state to prepare for moving ownership down into the wait queue level.

Tests: Build and overnight unit tests on QEMU, NUC and VIM2

Change-Id: I1451929e24a1bcb504be3f94ce858dbb6c7875f5
Showing 11 changed files
- zircon/kernel/include/kernel/wait.h: 15 additions, 1 deletion
- zircon/kernel/kernel/wait.cpp: 27 additions, 1 deletion
- zircon/kernel/object/BUILD.gn: 0 additions, 1 deletion
- zircon/kernel/object/futex_context.cpp: 336 additions, 202 deletions
- zircon/kernel/object/futex_node.cpp: 0 additions, 271 deletions
- zircon/kernel/object/include/object/futex_context.h: 171 additions, 30 deletions
- zircon/kernel/object/include/object/futex_node.h: 0 additions, 146 deletions
- zircon/kernel/object/include/object/process_dispatcher.h: 1 addition, 1 deletion
- zircon/kernel/object/include/object/thread_dispatcher.h: 1 addition, 1 deletion
- zircon/kernel/object/thread_dispatcher.cpp: 17 additions, 3 deletions
- zircon/kernel/syscalls/futex.cpp: 11 additions, 11 deletions