From: 05.04.1997 23:50
Subject: 2PC & 3PC - Long and possibly duplicate
To: interbase@esunix1.emporia.edu

My apologies if this appeared when I sent it yesterday. I didn't get a copy and thought the list server might have been feeling peckish.

At 03:09 AM 4/4/97 +0200, Soeren Soerensen wrote:
>> From: Bernard Wheeler
>> 2PC is shorthand for two phase commit. Two phase commit is
>> a mechanism used for multi-database transactions which
>> maintains data integrity across those databases.
>
>2PC rely on one "master" and "slave(s)". In systems with several databases
>a dying "master" (connection breaking down during the phases etc.), would
>keep the "slaves" from being properly updated. 3PC (three phase commit)
>handles this problem by allowing a "slave" to become the new "master" and
>makes sure the remaining "slaves" are updated.

Well, actually, there are more questions here than appear. InterBase's two phase commit is not organized as master/slave. Every participant has a log of every other participant, so any surviving database has the information necessary to locate the other participants and become the "master" for recovery.

A word to the naive here, then you're all excused. n-Phase commits are a synchronization mechanism. A program asks all the transactions in its application to enter a state (Prepared, for example), then waits until they have all confirmed that they are in that state before asking them to enter the next state. No transaction can enter the second state until all have completed the first.

Two phase commits have three states:

   Unprepared - must rollback
   Prepared   - can rollback or commit
   Committed  - can't rollback

Three-phase commit adds a state which I'll call 'Set' between Prepared and Committed.

The real advantage of three phase commit over two phase is that three phase gives "surviving" databases enough information so they can decide whether to commit or rollback before the other databases become available again.
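The lock-step rule described above (nobody enters the next state until everybody has confirmed the current one) can be sketched in a few lines of Python. This is only an illustration, not InterBase's implementation: the `Participant` class, the `run_commit` function and the in-memory "confirmation" are all hypothetical names made up for the sketch, and it covers the happy path only, with no failures or logging.

```python
from enum import Enum


class State(Enum):
    UNPREPARED = "U"  # must roll back
    PREPARED = "P"    # can roll back or commit
    SET = "S"         # the extra three-phase state between Prepared and Committed
    COMMITTED = "C"   # can't roll back


# Order in which the program asks participants to change state.
TWO_PHASE = [State.PREPARED, State.COMMITTED]
THREE_PHASE = [State.PREPARED, State.SET, State.COMMITTED]


class Participant:
    """Hypothetical in-memory stand-in for one database's transaction."""

    def __init__(self, name):
        self.name = name
        self.state = State.UNPREPARED

    def enter(self, state):
        # A real participant would write its log before confirming.
        self.state = state
        return True  # confirmation that the state has been entered


def run_commit(participants, phases):
    """Drive every participant through each phase in lock step.

    Returns a snapshot of all states after every single transition,
    so the combinations that actually occur can be inspected.
    """
    snapshots = []
    for target in phases:
        # Nobody is asked to enter the next state until this loop has
        # collected a confirmation from every participant.
        for p in participants:
            if not p.enter(target):
                raise RuntimeError(f"{p.name} did not confirm {target.name}")
            snapshots.append(tuple(q.state for q in participants))
    return snapshots


if __name__ == "__main__":
    for snap in run_commit([Participant("Y"), Participant("Z")], THREE_PHASE):
        print(" | ".join(s.value for s in snap))
```

Because of the lock step, two participants are never more than one state apart, which is what lets a survivor reason about what the missing participants might have done.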
That's a very interesting academic issue. Most of the InterBase users had applications that required all the databases, so no processing could continue until all databases were back on line. Automating recovery for a two phase commit was a hairy enough problem that we put it off for three releases. Recovery required human (some might say superhuman) intervention to read the logs and decide how to resolve transactions that failed during a two phase commit. Obviously, that human could decide to commit/rollback just the survivors and continue, accepting the risk that the unavailable transaction won't be in the same state.

For those who, like me, need to work out on their fingers why survivors can decide to commit or rollback if they're using a three phase commit but a two phase commit requires all participants before it can be recovered reliably, let me try these tables.

Transactions are labelled Y and Z. The phases are U[nprepared], P[repare], S[et], C[ommit], and ?. 'Set' is the term I'm using for the middle phase of three phase. '?' indicates that the transaction in question isn't answering - it's not a survivor. 'Unprepared' means that the transaction didn't start the commit sequence; it has rolled back.

Two phase commit:

   Y | Z
   -------
   U | U  - nothing has happened, both rollback automatically.
   U | P  - Y rolls back automatically, Z must roll back.
   P | P  - Both prepared and should (probably) commit, but can roll back.
   P | C  - Y must commit.
   ? | P  - If Y is actually unprepared, Z must rollback.
            If Y is actually committed, Z must commit.
            Z must wait for Y.
-> U | C  - This condition can NOT occur.

Note that neither is master or slave. Obviously, the combinations U | ? and C | ? don't require any handling - the transaction's fate is already determined.

Three phase commit:

   Y | Z
   -------
   U | U  - nothing has happened, both rollback automatically.
   U | P  - Y rolls back automatically, Z must roll back.
   P | P  - All participants are alive and prepared, so they may commit.
   ? | P  - Y must be either P, U, or S, so Z can rollback without waiting.
   P | S  - All participants are alive and at least prepared, so they may commit.
   ? | S  - Y must be either P, S, or C, so Z can commit without waiting.
   S | C  - Both must commit.
-> U | S  - This condition can NOT occur.
-> P | C  - This condition can NOT occur.
-> U | C  - This condition can NOT occur.

Add another transaction X, and eliminate the boring cases.

   X | Y | Z
   -----------
   ? | U | P  - X may be U or P. Y and Z can rollback without waiting.
   ? | P | P  - X may be U, P, or S. Y and Z can rollback without waiting.
   ? | P | S  - X may be P, S, or C. Y and Z can commit without waiting.
   ? | ? | P  - X or Y could be U, so Z must rollback.
   ? | ? | S  - X or Y could be C, so Z must commit.
-> U | S  - This condition can NOT occur.
-> P | C  - This condition can NOT occur.
-> U | C  - This condition can NOT occur.

In (approximately) English:

If any of the survivors is Set, then all the others must be at least Prepared, so the survivors can safely commit.

If none of the survivors is Set, then none of the others can have committed, so the survivors can safely rollback.

Anyone who has read this far can get a free ice-cream and a beach pass by showing up in Manchester, Mass. USA anytime after the snow melts. Anybody who can come up with a practical* example of database usage where a three-phase commit is important gets a lobster dinner after the day at the beach.

*IMHO - decision of the judge is final.

Ann

"`But he ate as many as he could get,' said Tweedledum."
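
For anyone who would rather check the tables by machine than on their fingers, here is a minimal Python sketch of the recovery rules they describe. The function names and the one-letter state strings are assumptions made up for this illustration; the functions take only the states of the participants that are still answering and say nothing about how a real system would read its logs. Treating a surviving Committed participant as grounds to commit follows the "P | C" and "S | C" rows rather than the English summary.

```python
def two_phase_decision(survivors):
    """Decide what the answering participants of a two phase commit can do.

    survivors: iterable of 'U', 'P' or 'C', one per answering participant.
    Returns 'commit', 'rollback' or 'wait'.
    """
    states = set(survivors)
    if "C" in states:
        return "commit"    # someone committed, so the rest must commit
    if "U" in states:
        return "rollback"  # someone never prepared, so nobody can have committed
    # Every survivor is Prepared: a missing participant may be U or C,
    # so the survivors cannot decide on their own.
    return "wait"


def three_phase_decision(survivors):
    """Decide what the answering participants of a three phase commit can do.

    survivors: iterable of 'U', 'P', 'S' or 'C', one per answering participant.
    Returns 'commit' or 'rollback'; there is never a need to wait.
    """
    states = set(survivors)
    if "S" in states or "C" in states:
        return "commit"    # everyone else is at least Prepared
    return "rollback"      # nobody else can have committed


if __name__ == "__main__":
    print(two_phase_decision(["P"]))         # the "? | P" row
    print(three_phase_decision(["P"]))       # the "? | P" row
    print(three_phase_decision(["P", "S"]))  # the "? | P | S" row
```

The only difference between the two functions is the all-Prepared case: two phase has to wait for the missing participant, while three phase can roll back.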