The impossibility of distributed consensus is a a significant fundamental result of distributed consensus in an asynchronous model. It simply states there is no solution to reach consensus in a asynchronous distributed system. Since a consensus protocol plays an important role to provide synchronization in many applications, the impossibility triggers a large amount of following work to render the impossibility very unlikely by adapting asynchronous models. This article is intended to provide essential interpretation of the proof and assume readers are familiar with the terminology in distributed systems.
- asynchronous no time bound on process execution of processing a message such that no differentiation between process crash and being very slow
- model of assumptions
- each process has an variable with initial value , a decision state plus internal process states such as the program counter.
- the decision state is initially undecided as and eventually decided provided the process does not crash
- a process executes on receiving a message and sending one or more messages of its variable to to other processes. Broadcast is supported.
- messages sent are eventually delivered but may be out of order
- at most one process is allowed to crash and no executions after crash
- consensus protocol
- configuration the union of the states of all the processes in the system
- event a possible empty message is delivered to a process that executes and changes its states
- schedule a sequence of events
- step an action of processing an event that leads the system from one configuration to another
- run a sequence of steps
- admissible run
- at most one process can fail
- messages are eventually delivered
- deciding run an admissible run that some process decides
- partially correct
- no accessible configuration has more than one decision value
- each decision value must be reachable from some accessible configuration
- totally correct despite one faulty process, a consensus protocol is partially correct and every admissible run is a deciding run
Essential Interpretation of the FLP impossibility
- Commutativity of schedule
Starting from a configuration, two sequences of steps applied to disjoint sets of processes in a different order lead to the same configuration since there is no overlapped changes to the state of a particular process.
- There exists a bivalent initial configuration
Assume no such bivalent configuration and all the configurations are either 0-valent or 1-valent. Let a pair of configurations be adjacent if both differ only in one step change of the state, say one bit, of a particular process. Then enumerate and link all the configurations if they are adjacent. Without loss of generality, there must exist a pair of adjacent 0-valent and 1-valent configurations, and . Let the one step state change be applicable to a process p which takes no steps after a sequence of steps starting from and respectively. Since may crash or be arbitrarily slow, the protocol has to eventually decide a value. At this point, the resulting configuration from C0 and C1 must be the same because the only difference is the state of p. Whether the protocol decides 0 or 1, both imply either or is bivalent and can be the initial configuration. - Starting from a bivalent configuration, there exists an admissible run leading to another bivalent configuration, i.e. not all admissible run are deciding.
Starting from a bivalent configuration , let be the set of configurations reachable from C without applying an delayed event . Let be the set of configurations reachable from with event applied. Assume has no bivalent configurations. Let be an i-valent configuration in since is bivalent such that configuration in can be reached by applying to . Alternatively, if is applied in reaching , then there exists a configuration in such that is reachable from. Both and has to be univalent. Without loss of generality, there exist a pair adjacent configurations and which are univalent in such that different values are eventually decided. Let be the event taking to . There are two cases to consider.
3.1)
Let and . Then is reachable from by applying e’, contradicting the fact that and are univalent from and .
3.2)
Let be a configuration reachable through a sequence of steps without and such that a 0-valent configuration is reachable by applying the same sequence of steps to and to . Similarly, another 1-valent configuration E1 is also reachable by applying the same sequence of steps to and to . This implies is bivalent and reachable from univalent , leading to a contradiction. Therefore has a bivalent configuration.
Following the above arguments, it is implied there exists an admissible non-deciding run such that starting from a bivalent configuration, a consensus protocol reaches another bivalent configuration infinitely often. Hence the termination property does not hold, proving the impossibility of distributed consensus in an asynchronous setting.
No comments:
Post a Comment