CSC 5780 : Theory of Distributed Computing Instructor: Bogdan Chlebus
Lecture 20 : Simulations among Types
This lecture is about asynchronous shared-memory systems. Shared variables may be of arbitrary types. There are n processes, which are prone to crashing.
Our goal is to compare types with respect to the power of their functionality. Comparisons are in terms of possible simulations among types, in the sense that a simulated type is considered “weaker” and a simulating type is “stronger.” Simulations represent capabilities to express functionality of weaker types by stronger ones. The ultimate goal for a simulation is to implement an atomic object of a simulated type.
We refer to shared registers and variables as shared objects, and we call such shared objects concurrent. We may also refer to a simulation as an implementation.
20.1 Types of shared objects
The specification of a shared register is called its type. The type of a shared object includes the following components.
1. A list oper_1, oper_2, ..., oper_k of operations.
2. A set of possible values to be returned in a response.
3. A set of possible states of processes.
4. A transition function that determines transitioning to next states and values to be returned.
Unless stated otherwise, operations performed on objects are understood as occurring on the "medium" level of modeling executions, so that operation oper_j is performed by an invocation inv_j, which also includes a list of parameters, and a matching response resp_j.
The list of parameters passed to a particular operation may be empty. Similarly, a response may return no value, in which case it acts like an acknowledgement. When an object is in a state current_state and operation oper_j has been invoked, the next state next_state and the returned value returned_value are determined as follows:
transition(current_state, oper_j) = (next_state, returned_value) .
Often states correspond to values, in the sense that the current state determines uniquely the value returned. For instance, when a read-write register stores a value then this value both determines the state of the object and the value to be returned by a read operation. If a value is represented by a state in this manner, then this value is said to be stored in the object, even when the object is not a read-write register.
Next we review all types considered in this lecture, except for read-write registers.
Test&Set: This type has the set {0,1} to represent both the states and the returned values. There are two operations test-and-set and reset. None of the operations passes any parameters, and reset does not return any value. Operation test-and-set returns the value stored in the object and sets it equal to 1. Operation reset returns an acknowledgement and sets the object to store 0.
Counter: This type has the nonnegative integers as both the set of values and the set of states. The only operation fetch-and-increment returns the value stored in the object and modifies the state by incrementing it by 1. The operation passes no parameters and returns the value originally held by the object.
Compare&Swap: This type has some set of values to store, which depend on the application that uses objects of this type. The operation compare-and-swap passes two values as parameters and returns the original value stored in the variable. The operation compare-and-swap(R,old,new), for object R, replaces the original value by new if the original value is equal to old.
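To make these three specifications concrete, here is a minimal Python sketch of their sequential semantics. The class and method names are illustrative, and a lock stands in for the atomicity that the lecture's objects have by assumption:

```python
import threading

class TestAndSet:
    """Test&Set: states and return values are in {0, 1}."""
    def __init__(self):
        self._state = 0
        self._lock = threading.Lock()  # models atomicity of each operation

    def test_and_set(self):
        with self._lock:
            old = self._state      # value stored in the object
            self._state = 1        # set the object to store 1
            return old

    def reset(self):
        with self._lock:
            self._state = 0        # returns no value: an acknowledgement

class Counter:
    """Counter: fetch-and-increment returns the old value."""
    def __init__(self):
        self._state = 0
        self._lock = threading.Lock()

    def fetch_and_increment(self):
        with self._lock:
            old = self._state
            self._state += 1
            return old

class CompareAndSwap:
    """Compare&Swap: install `new` only if the stored value equals `old`;
    the original value is returned either way."""
    def __init__(self, initial=0):
        self._state = initial
        self._lock = threading.Lock()

    def compare_and_swap(self, old, new):
        with self._lock:
            original = self._state
            if original == old:
                self._state = new
            return original
```

Note that compare-and-swap returns the original value whether or not the swap takes place, so compare-and-swap(R, v, v) for any v acts as a read; this observation is used repeatedly below.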
Intuitively, we feel that Counter is more expressive than Test&Set, and that Compare&Swap is even stronger than Counter. The property of being "at least as strong as" is formalized by the notion of an implementation.
Implementations/simulations among types:
Type S can n-implement type T if a system of n processes communicating by concurrent objects of type S and some auxiliary read-write registers can simulate an atomic object of type T. If this property holds for every n, then we say that type T can be implemented from type S.
Notice that an arbitrary number of objects of type S may be involved in an implementation.
There is a natural method to develop implementations in which invoking an operation starts with allocating a new record. One field of the record is an object, of a simulating type, that stores a pointer used to thread the record into a list. The number of such records, and hence of simulating objects as well, may depend on the number of processes, and for some implementations could even grow unbounded for any number of processes. Simulations in which the number of simulating objects depends on the number of processes, but stays bounded for each fixed number of processes, are clearly superior to those that may need infinitely many objects in an execution.
Our definition of type may seem not to be general enough to capture some “reasonable” types of concurrent objects. For instance, consider the following shared object of an apparently really strong type called Load-Link&Store-Conditional.
Load-Link&Store-Conditional: This type has a set of values that can be stored in an object of this type; the set depends on the application.
The load-link operation copies the value stored in the object to a private variable of a process.
A subsequent store-conditional(v) installs the value v as the new value stored by the object only if no other process has modified the object in the interim. The response to an invocation of store-conditional reports either failure or success.
It is assumed that a process invokes the operations load-link and store-conditional alternately, starting with load-link; otherwise the behavior of the register may be unpredictable, in the sense that it may depend on the implementation. What makes this type appear especially strong is that the outcome of an operation on an object of type Load-Link&Store-Conditional depends on its history, namely, on what the latest operation that modified the object was. The reason we do not extend the definition of types to cover apparently stronger objects like this one is that they can be implemented in terms of sufficiently strong types, like Compare&Swap, that are already captured by our "official" definition.
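As a plausibility check that Load-Link&Store-Conditional needs no extension of the definition, here is a Python sketch of its semantics built on a value-plus-version pair. The version stamp and the per-process link table are illustrative devices, not part of the lecture's definition; the version lets store-conditional detect intervening modifications:

```python
import threading

class LLSC:
    """Load-Link&Store-Conditional sketched over a (value, version) pair.
    Each modification bumps the version, so a stale link is detected even
    if the same value reappears."""
    def __init__(self, initial=0):
        self._pair = (initial, 0)        # (stored value, modification count)
        self._lock = threading.Lock()    # models atomicity of each operation
        self._linked = {}                # pair each process saw at load-link

    def load_link(self, pid):
        with self._lock:
            self._linked[pid] = self._pair
            return self._pair[0]

    def store_conditional(self, pid, v):
        with self._lock:
            if self._pair == self._linked.get(pid):  # unmodified since link
                self._pair = (v, self._pair[1] + 1)
                return True                           # success
            return False                              # failure
```

A short usage: if process 0 load-links, then process 1 load-links and successfully store-conditionals, process 0's subsequent store-conditional fails, as the type requires.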
The ultimate goal of simulations is to obtain atomic objects, but we are ready to consider less challenging objectives. Important weaker kinds of quality of implementations are introduced and discussed by way of examples in the following two sections.
20.2 Blocking and nonblocking implementations
We show how to implement Test&Set from Counter and Counter from Compare&Swap. These simulations are relatively direct and simple but lack quality.
20.2.1 A blocking implementation
We first present a simple implementation of Test&Set from Counter. There is a Counter object C and a read-write register R, both initialized to zero.
A direct implementation of Test&Set from Counter: To implement reset, a simulating process p performs the following:
repeat
    x ← fetch-and-increment(C)
    R ← x + 2
    y ← fetch-and-increment(C)
until y = x + 1
acknowledge
where x and y are private variables of p. Iterations continue producing a sequence z_1, z_2, z_3, ... of values, each returned by an operation fetch-and-increment(C) invoked by p, until, hopefully, z_{i+1} = z_i + 1 holds for some i ≥ 1. To implement test-and-set, a simulating process p iterates the following loop:
repeat
    y ← fetch-and-increment(C)
    read R and set x ← R
    if y > x then return 1
    if y = x then return 0

This loop iterates until y ≥ x holds, which we hope will eventually happen.
Any well-formed simulating execution of this implementation is linearizable. Showing this is left to the reader, see Exercise 2.
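The two loops above can be transliterated into Python for experimentation; the Counter is modeled by a plain integer guarded by a lock that stands in for atomicity, and all names are illustrative. In a solo (hence well-formed) execution the simulation behaves like a Test&Set object:

```python
import threading

class CounterObj:
    """A Counter object: fetch-and-increment returns the old value."""
    def __init__(self):
        self._c = 0
        self._lock = threading.Lock()  # models atomicity of the operation

    def fetch_and_increment(self):
        with self._lock:
            old = self._c
            self._c += 1
            return old

class TestAndSetFromCounter:
    """The blocking simulation: a Counter C and a read-write register R,
    both initialized to zero."""
    def __init__(self):
        self.C = CounterObj()
        self.R = 0                 # auxiliary read-write register

    def reset(self):
        while True:                # repeat ... until y = x + 1
            x = self.C.fetch_and_increment()
            self.R = x + 2
            y = self.C.fetch_and_increment()
            if y == x + 1:
                return             # acknowledge

    def test_and_set(self):
        while True:                # iterate until y >= x
            y = self.C.fetch_and_increment()
            x = self.R
            if y > x:
                return 1
            if y == x:
                return 0
```

Run solo, test-and-set first returns the stored 0 and then 1 on the next call; after a reset it returns 0 again, matching the Test&Set specification.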
Are there executions that are not well-formed? The answer is yes. To see this, consider the following scenario. There are just two processes p and q. Both are trying to reset the simulated object. The problem is that with a certain specific interleaving of events both processes can be starved and the system as a whole does not make progress. This may happen as follows.
fetch-and-increment_p(C) returns 0 ;
fetch-and-increment_q(C) returns 1 ;
R ←_p 2 ;
R ←_q 3 ;
fetch-and-increment_p(C) returns 2 ;
fetch-and-increment_q(C) returns 3 ;
(next iteration of the repeat-loop begins)
fetch-and-increment_p(C) returns 4 ;
fetch-and-increment_q(C) returns 5 ;

and this may continue forever.
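This starvation schedule can be replayed deterministically: under strict alternation each process always observes y = x + 2, never the required y = x + 1. A self-contained Python replay of three rounds (the variable names are illustrative):

```python
# A deterministic replay of the starvation schedule: two resetters
# alternate strictly, so neither ever observes y = x + 1.
C = 0          # the Counter object; fetch-and-increment returns the old value
R = 0          # the auxiliary read-write register

def fetch_and_increment():
    global C
    old = C
    C += 1
    return old

x = {"p": None, "q": None}
y = {"p": None, "q": None}
completed = []
for _ in range(3):                      # three rounds of the repeat-loop
    for proc in ("p", "q"):             # both processes fetch x
        x[proc] = fetch_and_increment()
    for proc in ("p", "q"):             # both write R; writes overwrite
        R = x[proc] + 2
    for proc in ("p", "q"):             # both fetch y and test the exit
        y[proc] = fetch_and_increment()
        if y[proc] == x[proc] + 1:
            completed.append(proc)
assert completed == []                  # no reset ever terminates
```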
Deadlock and blocking implementations: Deadlock is a scenario in which all the processes are starved and the system as a whole does not make any progress. An implementation is called blocking if deadlock is possible.
The implementation just considered is blocking.
20.2.2 A nonblocking implementation
Nonblocking implementations: An implementation is nonblocking if deadlock never occurs.
We show a very direct implementation of Counter from Compare&Swap. Let R be an object of type Compare&Swap. To simulate fetch-and-increment, a process executes the following pseudocode:

repeat
    X ← compare-and-swap(R, 0, 0)
    Y ← compare-and-swap(R, X, X + 1)
until X = Y
return X
The implementation is nonblocking. The first compare-and-swap just reads the object, without modifying it. If a number of processes are concurrently trying to increment the object, their reads do not interfere with each other, and some process is always the first to execute the second of its invocations of compare-and-swap; that process's increment succeeds.
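The loop runs as written when transliterated into Python; below, an object with a lock models the atomic Compare&Swap register, and the names are illustrative:

```python
import threading

class CASRegister:
    """compare-and-swap(old, new): returns the original value, and
    installs new only if the original equals old."""
    def __init__(self, initial=0):
        self._v = initial
        self._lock = threading.Lock()  # models atomicity of the operation

    def compare_and_swap(self, old, new):
        with self._lock:
            original = self._v
            if original == old:
                self._v = new
            return original

class CounterFromCAS:
    """The nonblocking simulation of fetch-and-increment."""
    def __init__(self):
        self.R = CASRegister(0)

    def fetch_and_increment(self):
        while True:
            # compare-and-swap(R, 0, 0) returns the stored value either
            # way, so it acts as a read (a harmless no-op swap when the
            # stored value happens to be 0)
            X = self.R.compare_and_swap(0, 0)
            Y = self.R.compare_and_swap(X, X + 1)
            if X == Y:          # the swap installed X + 1
                return X
```

Run by a single process, successive calls return 0, 1, 2, ..., as a Counter requires.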
20.3 Wait-free implementation
Next we consider implementations with better properties:
Wait-free implementations: Lockout is a scenario in which some process is starved, while possibly the system as a whole makes progress.
An implementation is called wait-free if lockout is not possible.
An implementation is called k-bounded wait-free if the number k is an upper bound, holding for every process, on the number of completed operations between an invocation and its matching response; this upper bound may depend on the number of processes in the system.
An implementation is called bounded wait-free if it is k-bounded wait-free for some number k ≥ 0.
The direct implementation of Counter from Compare&Swap given in Section 20.2.2 is not wait-free, because lockout is possible.
An example of a wait-free implementation. We develop a wait-free implementation of Counter from Compare&Swap.

There is an array Announce[0..n − 1] of shared read-write variables. An entry consists of two fields, each of them a read-write register. One field is a toggle bit named T_Bit. The other field, named Response, is to store an integer value. All the fields are initialized to zeroes.

A value processed by a Compare&Swap object R consists of three fields. The field Count stores a nonnegative integer. The field Process stores a number in [0, n − 1]. The field Toggle is a logical array of n entries, numbered 0 through n − 1. Let R-Null be the initial value of R, with all fields equal to zero.

To invoke (the simulated) fetch-and-increment, process i flips the bit Announce[i].T_Bit. Now it is distinct from the corresponding bit R.Toggle[i]. Next process i iterates a loop. In each iteration, it tries to help some process to complete its invocation, unless the invocation of i has already been completed, which makes i simply return and quit. An iteration by i starts with i reading the contents of R:

X ← compare-and-swap(R, R-Null, R-Null) .
Next i checks whether its own invocation has been completed. If Announce[i].T_Bit = X.Toggle[i], then this is the case, so i returns Announce[i].Response and exits the loop.
Otherwise, process i helps some process, possibly itself, whose operation is still pending. This is done by considering all the processes in a round-robin way. The round-robin order is globally synchronized by storing the currently considered process in the field R.Process. We say that a process k is active, according to the record X of values as stored in R, if

Announce[k].T_Bit ≠ X.Toggle[k] .
The process i tries to place a new value X′ in R. This value depends on whether the process X.Process is active or not. If X.Process is active, then the record X′ is determined as follows:

X′.Process = (1 + X.Process) mod n
X′.Count = 1 + X.Count
X′.Toggle[X.Process] = Announce[X.Process].T_Bit
X′.Toggle[k] = X.Toggle[k] for k ≠ X.Process

The action of i in this case is to first set Announce[X.Process].Response ← X.Count and then execute compare-and-swap(R, X, X′). If X.Process is not active, then the record X′ is determined as follows:
X′.Process = (1 + X.Process) mod n
X′.Count = X.Count
X′.Toggle = X.Toggle
The action of i in this case is to execute only compare-and-swap(R, X, X′). There is no need for process i to verify whether its interaction with R resulted in a modification of the stored value. If it was some other process that accomplished the modification, then that process incremented the field R.Process just the same and, if X.Process was active, used the same value X.Count to set Announce[X.Process].Response.
The resulting implementation is not only wait-free; it is actually n-bounded wait-free, see Exercise 7.

An advantage of this simulation is that it uses only one object of the type Compare&Swap. A deficiency of this simulation is that the object of the type Compare&Swap handles values of a complex structure and a large size. A possible way to change this is to trade the complexity and size of the values for the number of objects of the type Compare&Swap. Exercise 8 offers guidance in exploring this direction.
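The whole helping protocol fits in a short Python sketch. Records are modeled as immutable tuples so that compare-and-swap can compare them by value; the field order (Count, Process, Toggle), the class names, and the lock that stands in for atomicity are all illustrative devices:

```python
import threading

class CASObject:
    """An atomic Compare&Swap object over arbitrary comparable values."""
    def __init__(self, initial):
        self._v = initial
        self._lock = threading.Lock()  # models atomicity of the operation

    def compare_and_swap(self, old, new):
        with self._lock:
            original = self._v
            if original == old:
                self._v = new
            return original

class WaitFreeCounter:
    """Counter from a single Compare&Swap object, with helping.
    A record stored in R is a tuple (Count, Process, Toggle)."""
    def __init__(self, n):
        self.n = n
        self.announce = [{"T_Bit": 0, "Response": 0} for _ in range(n)]
        self.R = CASObject((0, 0, (0,) * n))     # R-Null: all fields zero

    def _read(self):
        # compare-and-swap with any (old, old) pair returns the stored
        # record, so it serves as a read
        probe = (0, 0, (0,) * self.n)
        return self.R.compare_and_swap(probe, probe)

    def fetch_and_increment(self, i):
        self.announce[i]["T_Bit"] ^= 1           # announce the invocation
        while True:
            X = self._read()
            count, proc, toggle = X
            if self.announce[i]["T_Bit"] == toggle[i]:
                return self.announce[i]["Response"]  # already completed
            if self.announce[proc]["T_Bit"] != toggle[proc]:
                # proc is active: record its response, then advance
                self.announce[proc]["Response"] = count
                new_toggle = list(toggle)
                new_toggle[proc] = self.announce[proc]["T_Bit"]
                X_new = (count + 1, (proc + 1) % self.n, tuple(new_toggle))
            else:
                # proc is not active: just pass the round-robin token on
                X_new = (count, (proc + 1) % self.n, toggle)
            self.R.compare_and_swap(X, X_new)    # no need to check success
```

A solo run with n = 2 illustrates both branches: the invoking process first helps itself, and on a later invocation must pass the token past a non-active process before being helped.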
Exercises
1. Suppose an implementation has a pseudocode without loops. Is it necessarily bounded wait-free? If you believe that the answer is negative, then consider whether such an implementation is necessarily wait-free.
2. Consider the blocking simulation of Test&Set from Counter given in Section 20.2.1.
(a) Show that any well-formed simulating execution is linearizable.

(b) Replace the instruction R ← x + 2 by R ← x + 1. Show that there are well-formed executions of the resulting simulation that are not linearizable.
(c) Show an example of a simulating execution by two processes in which one is executing reset and the other test-and-set and both are starved.
(d) In the given simulation, a process executing a simulated reset invokes the operation fetch-and-increment twice per each write to R. Propose a modified implementation, in which such a process invokes fetch-and-increment once per each write to R.

3. Consider the question of how to implement Counter from Test&Set.
(a) Develop a wait-free implementation for a system consisting of exactly two processes.
(b) Show that there is no wait-free simulation for more than two processes.
4. Extend all the given implementations of type Counter to implementations of type Fetch&Add.

5. Let R be a Compare&Swap register initialized to 0. Suppose a single process interacts with R. Find the responses produced by the following invocations:

compare-and-swap(R, 1, 2) ;
compare-and-swap(R, 0, 1) ;
compare-and-swap(R, 1, 2) ;
compare-and-swap(R, 1, 2) .

6. Describe a scenario in which a process is starved in the nonblocking implementation of Counter from Compare&Swap given in Section 20.2.2.
7. Consider the wait-free implementation of Counter from Compare&Swap given in Section 20.3.
(a) Show that it is n-bounded wait-free.
(b) Does the simulation work if Announce[k].T_Bit is writable by k and readable by all?
(c) Does the simulation work if Announce[k].Response is writable by k and readable by all?
8. Develop a wait-free implementation of Counter from Compare&Swap in which Compare&Swap objects manipulate only shared-memory addresses.
Hint: An invocation of an operation starts with a process allocating a record. There are a number of fields in the record. One of them is to store a value to return in a response. Another field is a Compare&Swap object that can contain a pointer to a record. The recent history of an execution is represented as a list of such records. A new record is advertised in the array Announce. An operation is performed by threading the record in at the head of the list. You may find it convenient to use yet another array, of records that have been successfully put into the list, and sequence numbers as fields in the records to identify the record inserted most recently. Develop a way to recycle records such that the overall number of records utilized by the processes is bounded.
9. Develop a bounded wait-free implementation of an object of type Load-Link&Store-Conditional from Compare&Swap.