Feedback-Directed Barrier Optimization in a Strongly Isolated STM

Abstract

Speed improvements in today’s processors have largely been delivered in the form of multiple cores, increasing the importance of abstractions that ease parallel programming. Software transactional memory (STM) addresses many of the complications of concurrency by providing a simple and composable model for safe access to shared data structures. Software transactions extend a language with an atomic primitive that declares that the effects of a block of code should not be interleaved with actions executing concurrently on other threads. Adding barriers to shared memory accesses provides atomicity, consistency and isolation.Strongly isolated STMs preserve the safety properties of transactions for all memory operations in a program, not just those inside an atomic block. Isolation barriers are added to non-transactional loads and stores in such a system to prevent those accesses from observing or corrupting a partially completed transaction. Strong isolation is especially important when integrating transactions into an existing language and memory model. Isolation barriers have a prohibitive performance overhead, however, so most STM proposals have chosen not to provide strong isolation.In this paper we reduce the costs of strong isolation by customizing isolation barriers for their observed usage. The customized barriers provide accelerated execution by blocking threads whose accesses do not follow the expected pattern. We use hot swap to tighten or loosen the hypothesized pattern, while preserving strong isolation. We introduce a family of optimization hypotheses that balance verification cost against generality.We demonstrate the feasibility of dynamic barrier optimization by implementing it in a bytecode-rewriting Java STM. Feedback-directed customization reduces the overhead of strong isolation from 505% to 38% across 11 non-transactional benchmarks; persistent feedback data further reduces the overhead to 16%. Dynamic optimization accelerates a multi-threaded transactional benchmark by 31% for weakly-isolated execution and 34% for strongly-isolated execution.

Christos Kozyrakis
Christos Kozyrakis
Professor, EE & CS

Stanford University