Wednesday, September 16, 2009

Transaction

1. ACID Properties

To preserve the integrity of data, the database system must ensure the following properties of the transactions:

Atomicity : Either all operations of the transaction are properly reflected in the database or none are.

Consistency : Execution of a transaction in isolation preserves the consistency of the database.

Isolation : Although multiple transactions may execute concurrently, each transaction must be unaware of other concurrently executing transactions. Intermediate transaction results must be hidden from other concurrently executing transactions. That is, for every pair of transactions Ti and Tj , it appears to Ti that either Tj finished execution before Ti started or Tj started execution after Ti finished (even though they may be executing concurrently -- their operations are interleaved)

Durability : After a transaction completes successfully, the changes it has made to the database persist, even if there are system failures.

These properties are often referred to as the ACID properties.

Suppose X is a data item in database. The following are two operations to access item X:

--read(X) : transfer the data item X from the database to a local buffer belonging to the transaction that executed the read operation

--write(X) : transfer the data item X from the local buffer of the transaction that executed the write back to the database

Example: (Fund transfer) Transaction to transfer $50 from account A to account B:

read(A)
A := A - 50
write(A)
read(A)
B := B + 50
write(B)

consistency: the sum of balances of account A and account B is unchanged by the execution of the transaction. Note that the database is temporarily inconsistent after step 3 but before step 6.

atomicity : two updates are performed at steps 3 and 6. If the transaction fails after step 3 and before step 6, the system should ensure that the updates are not reflected in the database, else an inconsistency will result.

durability : once the transaction has completed (i.e., the transfer of the $50 has taken place), the updates to the database by the transaction must persist despite failures

isolation : if between steps 3 and 6, another transaction is allowed to access the partially updated database, it will see an inconsistent database (the sum of balances of account A and account B will be less than it should be). The database system should ensure that this inconsistent database state be not seen by another transaction.

This can be trivially achieved by executing transactions serially, that is, one after the other. However, running multiple transactions concurrently increase the throughput.

2. Transaction State

A transaction must be in one of the following states (using the transaction model in book):
Active :
the initial state; the transaction stays in this state while it is executing
Partially committed : after the final statement has been executed
Failed : normal execution can no longer proceed
Aborted: after the transaction has been rolled back and the database restored to its state prior to the start of the transaction.

Two options after it has been aborted:

-- restart the transaction, only if no internal logical error

-- kill the transaction

Committed, after successful completion

3. Concurrent Executions


Multiple transactions are allowed to run concurrently in the system. Advantages are:

increased processor and disk utilization, leading to better transaction throughput: one transaction can be using the CPU while another is reading from or writing to the disk

reduced average response time for transactions : short transactions need not wait behind long ones

concurrency control schemes -- mechanisms to control the interaction among the concurrent transactions in order to prevent them from destroying the consistency of the database

4. Schedules

· A schedule of a set of transactions is an ordering of the instructions of the transactions such that

¨ it consists of all instructions of each transaction of this set

¨ it preserves the order in which the instructions appear in each individual transaction

Example: (a) Let T1 transfer $50 from account A to account B, and T2 transfer 10% of the balance from account A to account B. The following schedule (schedule I) is a serial schedule, in which T1 is followed by T2 .




(b) Let T1 and T2 be the transactions defined previously. The following schedule (schedule II) is not a serial schedule, but it is equivalent to schedule I


Note that in both schedules I and II, the sum of the balances of A and B is preserved.


(c) The following concurrent schedule (schedule III) does not preserve the value of the sum of the balances of A and B.


5. Serializability

· Basic assumption: each transaction preserves database consistency

· Thus, serial execution of a set of transactions preserves database consistency

· A (possible concurrent) schedule is serializable if it is equivalent to a serial schedule. Different forms of schedule equivalence give rise to the notions of

(i) conflict serializability

(ii) view serializability


· To focus on the main ideas, in our discussions, we ignore operations other than read and write instructions, and assume that transactions may perform arbitrary computations on data in local buffers in between reads and writes. That is, the simplified schedules consist of only read and write instructions.



Conflict Serializability

· Instructions Ii and Ij of transactions Ti and Tj respectively conflict if and only if there exists some item Q accessed by both Instructions Ii and Ij and at least one of these instructions wrote Q.

· (i) Ii = read(Q) and Ij = read(Q) : Ii and Ij don't conflict

(ii) Ii = read(Q) and Ij = write(Q) : Ii and Ij conflict

(iii) Ii = write(Q) and Ij = read(Q) : Ii and Ij conflict

(iv) Ii = write(Q) and Ij = write(Q) : Ii and Ij conflict

· Intuitively, a conflict between Ii and Ij forces a temporal order between them. If Ii and Ij are consecutive in a schedule and they do not conflict, their results would remain the same even if they had been interchanged in the schedule

· If a schedule S can be transformed into a schedule S' by a series of swaps of non-conflicting instructions, we say that S and S' are conflict equivalent.

· We say that a schedule S is conflict serializable if it is conflict equivalent to a serial schedule.

· Example of a schedule that is not conflict serializable:

T3 T4

read(Q)

write(Q)

write(Q)















We are unable to swap instructions in the above schedule to obtain either the serial schedule
or the serial schedule .

· Schedule III below can be transformed into Schedule I , a serial schedule where T1 is followed by T2 , by a series of swaps of non-conflicting instructions. Therefore, schedule III is conflict serializable.


ead(B)

View Serializability

· Let S and S' be two schedules with the same set of transactions. S and S' are view equivalent if the following three conditions are met:

(a) For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then transaction Ti must in schedule S' also read the initial value of Q.

(b) For each data item Q, if transaction Tj executes read(Q) in schedule S, and that value was produced by transaction Ti (if any), then transaction Tj must in schedule S' also read the value of Q that was produced by transaction Ti .

(c) For each data item Q, the transaction (if any) that performs the final write(Q) operation in schedule S must perform the final write(Q) operation in schedule S'.

· A schedule S is view serializable if it is view equivalent to a serial schedule.

· Every conflict serializable schedule is also view serializable.

· Schedule IV below is a schedule which is view serializable but not conflict serializable.

It is view equivalent to the serial schedule T3, T4, T6 .

T3 T4 T6

read(Q)

write(Q)

write(Q)

write(Q)

Schedule IV



















· In schedule IV, T4 and T6 each performs a write(Q) without having performed a read(Q) operation. Such a write operation is called a blind writes.

It can be shown that blind writes appear in every view-serializable schedule that

is not conflict serializable.

No comments:

Post a Comment