Resolving In-Doubt Transactions
Before starting, here is a PDF version of the document that I created years ago: Resolving In-Doubt Transactions
Distributed transactions perform DML on multiple databases which is a bit more complicated task because the database must coordinate the consitency in those seperate or even perhaps between different DBMSs (like Oracle – MS SQL). To ensure the transaction atomicity, Oracle implements a 2-phase commit mechanism through which the distributed transactions undergo some phases like prepare, commit, forget, etc. This phases constitute the hand-shake mechanism of the distributed transaciton.
However, sometimes things may go wrong (due to some network, system problem or even a reconfiguration of the underlying objects) and one of the phases fails while others are ok. Here, we say that the transaction becomes in-doubt. Normallly this problem should be handled by the RECO process itself, but in some cases this cannot br performed.
Why RECO cannot perform in some cases?
One of the databases involved in the distributed transaction might be unreachable (netowrk, system issues etc.) while the RECO was trying to resolve the problem (even when retrying to recover). (UNSTUCK)
The lookup tables of the “2-phase commit” mechanism might become inconsistent with the transaction itself. (STUCK)
1.HANDLING UNSTUCK TRANSACTIONS
Hopefully, there is no inconsistency between the lookup tables and the transaction and the following code resolves the problem.
To View the pending transactions:
‘96.22.163456’ id of the pending distributed transaction, which we will use in the following commands.
If the state is “prepared” the transaction can be forced to rollback or commit:
Note 1: If the command hangs, go to the “Handling Stuck DBA_2PC_PENDING” section.
Note 2: If the state of the transaction is “collecting” and you execute the above command, you may see an error like:
ERROR at line 1:
ORA-02058: no prepared transaction found with ID 96.22.163456
Then, you have to run:
2. HANDLING STUCK TRANSACTIONS
Our ultimate goal is not seeing the transaction in X$KTUXE table; and ensuring that the dictionary tables like PENDING_TRANS$ to be consistent with this information.
Stuck transactions can be examined under the below conditions:
2.1. Condition 1: DBA_2PC_PENDING view have entries about our transaction but there is no transaction in reality
The condition is that; when we issue select to the dictionary views like the DBA_2PC_PENDING, PENDING_TRANS$, etc. we see the transaction, but the transaction does not exist in X$KTUXE view.
The actual transaction entry view is X$KTUXE ([K]ernel [T]ransaction [U]ndo Transa[X]tion [E]ntry) where the columns correspond to the following sections of the transaction id:
Therefore, the condition1 holds when DBA_2PC_PENDING has the entry but X$KTUXE does not.
Solution 1 to Condition 1: If the state of the transaction (in DBA_2PC_PENDING) is committed, rollback forced or commit forced then it can be cleaned by:
Solution 2 to Condition 1: If the state of the transaction is prepared, we have to clean manually as follows:
2.2. Condition 2: DBA_2PC_PENDING view does NOT have entries about our transaction but there IS A transaction.
This is something like a orphan transaction that the dictionary is not aware of.
Trying to force commit or rollback this transaction may result in error like below, since the dictionary is not aware:
Solution to Condition 2: What we need to do at this point is; recovering our transaction from being an orphan by inserting some dummy records into dictionay tables (so the views…) and then force a rollback or commit: You do not have to change the parameters in the insert command other than the transaction id.
Now, we should be able to rollback or commit.
Lastly, we remove the dummy entry from the dictionary:
Check to see whether the transaction has gone:
2.3. Condition 3: DBA_2PC_PENDING has entry and there is a transaction but COMMIT or ROLLBACK HANGS!
In the situation, where COMMIT FORCE or ROLLBACK FORCE hangs,
Trying to purge the transaction will give an error like:
Solution to Condition 3: The solution is the combination of Cond1 and Cond2:
First, delete the dictionary entries:
Then, insert dummy record, force commit and finally purge the transaction: