Serializability

Comprehensive study notes, diagrams, and exam preparation for Serializability.

Serializability

Definition

Serializability is a fundamental property of database transaction scheduling that ensures a concurrent execution of multiple transactions produces the same result as if those transactions were executed one after another (serially) in some order. It is the gold standard for maintaining database consistency and isolation in multi-user systems.


Main Content

1. Conflict Serializability

  • A schedule is conflict-serializable if it can be transformed into a serial schedule by swapping non-conflicting operations.
  • Two operations conflict if they belong to different transactions, access the same data item, and at least one of them is a "write" operation.

2. View Serializability

  • A schedule is view-serializable if it is "view equivalent" to some serial schedule.
  • This is a more relaxed condition than conflict serializability; a schedule might be view-serializable even if it is not conflict-serializable, provided the final state of the data and the source of each read operation remain consistent.

3. Recoverable Schedules

  • Serializability must often be combined with recoverability to ensure that if a transaction fails, the system can revert to a consistent state.
  • A schedule is recoverable if a transaction $T_i$ only commits after all transactions $T_j$ from which $T_i$ read data have already committed.

Working / Process

1. Dependency Graph Construction

  • Create a node for each transaction participating in the schedule.
  • Draw a directed edge from $T_i$ to $T_j$ if $T_i$ performs an operation that conflicts with an operation performed by $T_j$, and $T_i$ executes that operation first.

2. Cycle Detection

  • Analyze the constructed precedence graph to identify the presence of cycles.
  • If a cycle exists, the schedule is not conflict-serializable. If no cycles exist, the schedule is conflict-serializable.

3. Topological Sorting

  • If the graph is acyclic, perform a topological sort to determine the equivalent serial execution order.
  • The sorted order of transactions represents the serial sequence that produces the same final database state.
[T1] ---> [T2] ---> [T3]

(Above: A DAG showing a conflict-serializable order T1 -> T2 -> T3)

Advantages / Applications

  • Maintains data integrity by preventing anomalies like "lost updates," "dirty reads," and "unrepeatable reads."
  • Enables high-performance concurrent processing in banking systems, e-commerce platforms, and airline reservation systems.
  • Provides a theoretical foundation for concurrency control protocols like Two-Phase Locking (2PL) and Timestamp Ordering.

Summary

  • Serializability guarantees that concurrent transactions yield the same output as a serial execution.
  • It prevents data corruption in multi-user environments by enforcing strict ordering.
  • Conflict serializability is the most common practical standard for database engines.
  • Important terms: Conflict, Precedence Graph, Transaction, Schedule, Atomicity, and Isolation.