Transactions:

A tomicity
C onsistency
I solation
D urability

   - Atomicity: execute all operations or none
   - Consistency: transform data from one consistent state to another consistent state
   - Isolation: execute correctly even in the presence of concurrent operations
   - Durability: changes made by transactions are not lost

Concurrency Control:
-------------------------


Transaction:

T1: r1(x) w1(x)

T2: r2(x) w2(x)



Concurrent operations:

Time flows down:              Assume x = 20

|  T1                         T2
|  r1(x) // x = 20
|  x = x + 5  // x = 25
|
|                            r2(x)  // x = 20
|                            x = x - 5  // x = 15
|                            w2(x)  writes 15
|
|  w1(x)  // writes 25
|
| Not serializable!!!
v
Time

Schedule:

S0: r1(x) r2(x) w1(x) w2(x)

S1: r1(x) w1(x) r2(x) w2(x)

S2: r2(x) w2(x) r1(x) w1(x)


S1:
Time flows down:              Assume x = 20

|  T1                         T2
|  r1(x) // x = 20
|  x = x + 5  // x = 25
|  w1(x)  // writes 25
|                            r2(x)  //x = 25
|                            x = x - 5  // x = 20
|                            w2(x)  writes 20
|
|
v
Time

S2:
Time flows down:              Assume x = 20

|  T1                         T2
|                            r2(x)  //x = 20
|                            x = x - 5  // x = 15
|                            w2(x)  writes 15
|  r1(x) // x = 15
|  x = x + 5  // x = 20
|  w1(x)  // writes 20
|
|
v
Time



What is a good schedule?
--------------------------

A serial schedule (where one transaction executes at a time) is a good
schedule: if I execute one transaction at a time, the resulting
database state is always correct.

A serializable schedule is one that is guaranteed to provide the same
results as a serial schedule.


Conflicting operations
------------------------

Two operations conflict with each other, if they are by two different
transactions and at least one of them is a write operation:

r1(x) w2(x)

w1(x) r2(x)

w1(x) w2(x)


If in two schedules, the conflicting operations occur in the same
order, then their result is guaranteed to be the same: then these two
schedules are equivalent


S0: r1(x) [ r2(x) w1(x) ] w2(x)   --> r2(x) reads values before it is changed

S0': r1(x) [ w1(x) r2(x)] w2(x)   --> r2(x) read the value after T1 writes

S0 and S0' are not guaranteed to produce the same result


A schedule S is serializable (equal to a serial schedule) iff all
conflicting operations in S occur in the same order as a serial
schedule.

----------


Conflicting operations
------------------------

Two operations conflict with each other, if they are by two different
transactions and one of them is a write operation

r1(x) w2(x)

w1(x) r2(x)

w1(x) w2(x)


Serializability:
-----------------------------

A schedule S is serializable (equal to a serial schedule) iff all
conflicting operations in S occur in the same order as a serial
schedule.


Checking Serializability:
----------------------------

Create a conflict graph for a schedule S:

  -- Each transaction is a node
  
  -- For each conflict in S of the form:

     ri(x) wj(x)
     wi(x) rj(x)
     wi(x) wj(x)

     Draw an edge from Ti to Tj.


T1->T2, T2->T1



If the resulting graph has a cycle, then this schedule is NOT serializable.


If the resulting graph has no cycles, then this schedule is
serializable. I can order transaction using topological sort on the
conflict graph.



Examples:

S1: r1(x) r1(y) w2(y) w2(z) r3(z) w3(k) r1(k) w2(l) w1(x)

S1': r1(x) r1(y) w2(y) w2(z) r3(z) r1(k) w3(k) w2(l) w1(x)



Concurrency Control:
-------------------------

Concurrent operations:


Schedule:

S0: r1(x) r2(x) w1(x) w2(x)


S1: r1(x)  w1(x) r2(x) w2(x)

T1 -> T2

S2:  r2(x) w2(x) r1(x) w1(x)

T2 -> T1

S1,S2: serial schedules


Conflicting operations
------------------------

Two operations conflict with each other, if they are by two different
transactions and one of them is a write operation

r1(x) w2(x)

w1(x) r2(x)

w1(x) w2(x)


Serializability:
-----------------------------

A schedule S is serializable (equal to a serial schedule) iff all
conflicting operations in S occur in the same order as a serial
schedule.


Checking Serializability:
----------------------------

Create a conflict graph for a schedule S:

  -- Each transaction is a node
  
  -- For each conflict in S of the form:

     ri(x) wj(x)
     wi(x) rj(x)
     wi(x) wj(x)

     Draw an edge from Ti to Tj.


T1->T2, T2->T1



If the resulting graph has a cycle, then this schedule is NOT serializable.


If the resulting graph has no cycles, then this schedule is
serializable. I can order transaction using topological sort on the
conflict graph.



Examples:

S2: r1(x) w2(x) w1(y) r3(y) w3(z) r2(z) r3(w) w2(z)

Conflicts:

r1(x) w2(x)
w1(y) r3(y)
w3(z) r2(z)
w3(z) w2(z)

Is it serializable? -> No cycles, it is serializable.

Equivalent serial order: T1, T3, T2

----------------------------------


S3: r1(x) w1(x) w1(z) w3(z) r4(z) w2(w) r4(w) r2(w) r3(x) r2(z) w4(w) w4(z)

Conflicts:

w1(x) r3(x)
w1(z) r4(z)
w1(z) w3(z)
w1(z) w4(z)
w1(z) r2(z)
w3(z) r2(z)
w3(z) r4(z)
w3(z) w4(z)
w2(w) w4(w)
w2(w) r4(w)

Serializable, no cycles.

T1, T3, T2, T4




Concurrency Control:
----------------------

Each transaction must obtain a lock for each operation, read (shared
lock) to read an item and write (exclusive) lock to write an item.

Read/Shared locks: Multiple transactions can hold shared lock on the
same item


Write/Exclusive locks: Only one transaction can hold a write lock on a
specific item



2PL: Two phase locking:
-------------------------

Any transaction can be in only one of the two phases:

Growing phase: A transaction can get new locks. As soon as xact
releases a lock, then it enters shrinking phase.

Shrinking phase: A transaction can only release locks (but cannot get
new locks).


If a transaction manager implements two phase locking, then the resulting
schedules are serializable.


Proof: showing no cycles are possible using Two phase locking.
------------


Strict 2PL:

No shrinking phase: Get locks and keep them until commit time.

All schedules resulting from strict 2PL are also guaranteed to be
serializable.

--------------

 r1(x)  ....  w1(x)

 shared_lock1(x) r1(x)  ....  upgrade_lock1(x) w1(x)








Concurrency control with locks
----------------------------------

Two phase locking/Strict two phase locking


|  T1                         T2
|  sl1(x)
|  r1(x)
|                            sl2(y)
|                            r2(y)
|  request xl1(y) to w1(y)
|  wait
|                            request xl2(x) to w2(x)
|                            wait
|
|
v
Time


       Wait for
    /-----------|
   /            v
/----\       /----\
| T1 |       | T2 |
\----/       \----/
   ^           /
   |----------/
     Wait for

Deadlocks:

   -> Periodically check for deadlocks: wait-for graph
   -> Avoid deadlocks:

---------------------

What are we locking?
---------------------

Relation

Page

Tuple


Trade off: high concurrency - high overhead

--------------------------

Lock Granularity
-----------------

IS
IX
S
X

PHANTOM UPDATE
-----------------

Read uncommitted  -> possible to read uncommitted data
Read committed    -> possible to read the same data twice and get different results
Repeatable read   -> possible to have phantom updates
Serializable      -> not  possible to have phantom updates


T1:
SELECT count(*) FROM episodes;

Locked all tuples using S (or IS)

T2:
INSERT INTO episodes VALUES(....) ;

<< possible since no lock on a new tuple!>>


To implement serializability: lock a condition!

SELECT count(*) FROM episodes WHERE year = 2020;