At the Intersection of PHP and Microsoft
Last week I had the good fortune of presenting at the the Seattle PHP Meet Up on a topic I knew only a little about: transaction isolation levels. It was fun doing the homework to learn more and I’ll share what I learned in this post. However, before diving in, I want to call out two things:
So, consider this post to be Database Transactions and Isolation Levels 101.
A database transaction is a unit of work (i.e. a group of queries) that should either completely succeed or completely fail. A classic example used to illustrate transactions is what happens when you transfer money from your checking account to your savings account. Typically, this would involve two update statements (showing a $10 transfer):
UPDATE BankAccount SET Checking = Checking - 10 WHERE AccountId = 1 UPDATE BankAccount SET Savings = Savings + 10 WHERE AccountId = 1
In order for the transfer to be successful, BOTH of these queries must succeed. And, if one of them fails, it would be best for the other to fail too (thus meaning that the transfer was unsuccessful, but money didn’t “magically” appear or disappear in one account). This is exactly what happens if we make these statements part of a transaction. When statements are part of a transaction, we can then commit the transaction (i.e. permanently save the results) if everything went according to our liking, or rollback the transaction (i.e. return to the state the database was in before we started the transaction) if something went wrong. A bit of PHP code makes this idea clear:
$serverName = "ServerName\sqlexpress"; $connectionInfo = array( "Database"=>"DatabaseName", "UID"=>"uid", "PWD"=>"pwd"); $conn = sqlsrv_connect( $serverName, $connectionInfo); //Initiate transaction. sqlsrv_begin_transaction( $conn ); //Initialize parameter values. $money = 10; $accountId = 1; // Set up and execute the first query. $tsql1 = "UPDATE BankAccount SET Checking = (Checking - ?) WHERE AccountId = ?"; $params1 = array($money, $accountId); $stmt1 = sqlsrv_query( $conn, $tsql1, $params1 ); // Set up and execute the second query. $tsql2 = "UPDATE BankAccount SET Savings = (Savings + ?) WHERE AccountId = ?"; $params2 = array($money, $accountId); $stmt2 = sqlsrv_query( $conn, $tsql2, $params2 ); //If both queries were successful, commit the transaction. //Otherwise, rollback the transaction. if( $stmt1 && $stmt2 ) { sqlsrv_commit( $conn ); echo "Transaction was committed.<br/>"; $stmt3 = sqlsrv_query($conn, "SELECT * FROM BankAccount WHERE AccountId = ?", array($accountId)); $row = sqlsrv_fetch_array($stmt3); echo "AccountId: ".$row[0]." Checking: ".$row[1]." Savings: ".$row[2]."<br/>"; } else { sqlsrv_rollback( $conn ); echo "Transaction was rolled back.<br/>"; $stmt3 = sqlsrv_query($conn, "SELECT * FROM BankAccount WHERE AccountId = ?", array($accountId)); $row = sqlsrv_fetch_array($stmt3); echo "AccountId: ".$row[0]." Checking: ".$row[1]." Savings: ".$row[2]."<br/>"; }
$serverName = "ServerName\sqlexpress"; $connectionInfo = array( "Database"=>"DatabaseName", "UID"=>"uid", "PWD"=>"pwd"); $conn = sqlsrv_connect( $serverName, $connectionInfo);
//Initiate transaction. sqlsrv_begin_transaction( $conn ); //Initialize parameter values. $money = 10; $accountId = 1;
// Set up and execute the first query. $tsql1 = "UPDATE BankAccount SET Checking = (Checking - ?) WHERE AccountId = ?"; $params1 = array($money, $accountId); $stmt1 = sqlsrv_query( $conn, $tsql1, $params1 );
// Set up and execute the second query. $tsql2 = "UPDATE BankAccount SET Savings = (Savings + ?) WHERE AccountId = ?"; $params2 = array($money, $accountId); $stmt2 = sqlsrv_query( $conn, $tsql2, $params2 );
//If both queries were successful, commit the transaction. //Otherwise, rollback the transaction. if( $stmt1 && $stmt2 ) { sqlsrv_commit( $conn ); echo "Transaction was committed.<br/>"; $stmt3 = sqlsrv_query($conn, "SELECT * FROM BankAccount WHERE AccountId = ?", array($accountId)); $row = sqlsrv_fetch_array($stmt3); echo "AccountId: ".$row[0]." Checking: ".$row[1]." Savings: ".$row[2]."<br/>"; } else { sqlsrv_rollback( $conn ); echo "Transaction was rolled back.<br/>"; $stmt3 = sqlsrv_query($conn, "SELECT * FROM BankAccount WHERE AccountId = ?", array($accountId)); $row = sqlsrv_fetch_array($stmt3); echo "AccountId: ".$row[0]." Checking: ".$row[1]." Savings: ".$row[2]."<br/>"; }
Note: The PHP code above uses the transaction APIs of the SQLSRV driver (sqlsrv_begin_transaction, sqlsrv_commit, sqlsrv_rollback) to handle transaction operations. This is the recommended way to begin and commit or rollback transactions even though it is possible, to use plain ‘ol SQL to do so (e.g. BEGIN TRANSACTION). The reason for this is that queries that are not part of an explicit transaction are part of an implicit transaction (this behavior is common to most RDMSs). If you begin a transaction like this, sqlsrv_query($conn, “BEGIN TRANSACTION”), that query itself is part of an implicit transaction (i.e. you have nested transactions). Rather than try to predict the outcome of such transactions, it is simply recommended that you use the driver APIs to handle transaction operations.
The code above should successfully commit the transaction. To see what happens if part of the transaction fails, change this line, $params2 = array($money, $accountId);, to this: $params2 = array(“ABCD”, $accountId);. This will cause the second UPDATE query to fail (“ABCD” is not the correct data type for the Savings column), thus forcing the entire transaction to be rolled back. The values in the accounts will be restored to their values before the transaction was initiated.
Transactions are relatively simple until we start wondering what happens when multiple users are trying to read and/or modify data while a transaction is in progress. Suppose that while the transaction above was in progress another user with access to the bank account tried to find out the balance for the checking account. What should this user see? The amount before the transaction was initiated? The amount after the transaction is committed? What if the transaction is rolled back? This is where transaction isolation levels come into play. However, it is difficult to talk about the effects of isolation levels with out first talking about three concepts related to concurrency:
Those concepts are important in understanding isolation levels.
Here, I will describe how the following transaction isolation levels can be used to change what multiple users can see when accessing data that is in transaction: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and SERIALIZABLE (all of which are supported by SQL Server, MySQL, Oracle, DB2). (For information about SNAPSHOT isolation, see this post: SQL Server Driver for PHP Connection Options: Snapshot Isolation.)
To effectively show what each of these isolation levels does, I need to be able to begin a transaction and leave it open (i.e. not commit it or roll it back). This is very easy to do using SQL Server Management Studio (SSMS) and Transact-SQL, but not so easy using PHP scripts since it’s difficult to execute only part of a script (at least it’s difficult when it comes to demonstrating transactions). You can download and install (for free) SQL Server Express and SQL Server Management Studio here in case you want to work along with this post. (Be sure to select Database with Management Tools or Database with Advanced Services.)
Here is the script for creating the database that I’ll use:
CREATE DATABASE [TestDB] GO USE [TestDB] GO CREATE TABLE [dbo].[BankAccount]( [AccountId] [int] NOT NULL, [Checking] [money] NOT NULL, [Savings] [money] NOT NULL, CONSTRAINT [PK_BankAccount] PRIMARY KEY CLUSTERED ( [AccountId] ASC ) ) ON [PRIMARY] GO INSERT INTO BankAccount (AccountId, Checking, Savings) VALUES (1, 100, 100) GO
CREATE DATABASE [TestDB] GO
USE [TestDB] GO
CREATE TABLE [dbo].[BankAccount]( [AccountId] [int] NOT NULL, [Checking] [money] NOT NULL, [Savings] [money] NOT NULL, CONSTRAINT [PK_BankAccount] PRIMARY KEY CLUSTERED ( [AccountId] ASC ) ) ON [PRIMARY] GO
INSERT INTO BankAccount (AccountId, Checking, Savings) VALUES (1, 100, 100) GO
With that database in place, I’ll demonstrate how to set the various transaction isolation levels and what affect they have.
When a transaction is run under the READ UNCOMITTED isolation level, other users can read data that has not yet been committed (i.e. dirty reads are allowed). For example, consider the following transaction (which you can execute in SSMS). Note that the transaction is still open (i.e. it has not been committed) since the COMMIT TRANSACTION command is commented out:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED BEGIN TRANSACTION DECLARE @money INT DECLARE @accountId INT SET @money = 10 SET @accountId = 1 UPDATE BankAccount SET Checking = Checking - @money WHERE AccountId = @accountId UPDATE BankAccount SET Savings = Savings + @money WHERE AccountId = @accountId --COMMIT TRANSACTION
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED BEGIN TRANSACTION
DECLARE @money INT DECLARE @accountId INT SET @money = 10 SET @accountId = 1
UPDATE BankAccount SET Checking = Checking - @money WHERE AccountId = @accountId UPDATE BankAccount SET Savings = Savings + @money WHERE AccountId = @accountId --COMMIT TRANSACTION
When another query tries to read the data that is in the transaction, the uncommitted data is returned. For example, if you execute this query in a new SSMS window…
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
BEGIN TRANSACTION
SELECT * FROM BankAccount
COMMIT TRANSACTION
…you’ll see values of 90 and 110 returned for Checking and Savings respectively (i.e. uncommitted data is readable). If you highlight the COMMIT TRANSACTION in the first query window and press F5, you will commit the transaction. As I mentioned earlier, whether you want users to be able to read uncommitted data (also known as “dirty” data) will depend on the needs of your application.
The READ UNCOMITTED isolation level is the most permissive level. Not only does it allow dirty reads, it allows non-repeatable reads and phantom reads (which are demonstrated in the examples below).
When a transaction is run under the READ COMMITTED isolation level, users cannot read data that has not yet been committed; only committed data is readable. (This is the default isolation level for most databases.) To see this, execute this query in SSMS (note that the transaction has not been committed):
SET TRANSACTION ISOLATION LEVEL READ COMMITTED BEGIN TRANSACTION SELECT * FROM BankAccount --COMMIT TRANSACTION
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
--COMMIT TRANSACTION
You should see data returned for Checking and Savings.
Now execute the following query in a new SSMS window (again note that the transaction has not been committed):
BEGIN TRANSACTION DECLARE @money INT DECLARE @accountId INT SET @money = 10 SET @accountId = 1 UPDATE BankAccount SET Checking = Checking - @money WHERE AccountId = @accountId UPDATE BankAccount SET Savings = Savings + @money WHERE AccountId = @accountId --COMMIT TRANSACTION
Now return to the first query, highlight SELECT * FROM BankAccount, and press F5 (this executes the SELECT statement as part of the open transaction). , You will see this at the bottom of the query window:
Because the the isolation level is READ COMMITTED, only committed data will be returned. The query is blocked from reading any data because it is trying to access uncommitted data. When you commit the data (select COMMIT TRANSACTION in the second transaction and press F5), you will see data returned by the first query (it is no longer blocked because the data it is selecting is now committed).
Be sure to select COMMIT TRANSACTION in the first transaction and press F5 (so we don’t leave any transactions open).
While the READ COMMITTED isolation level does not allow dirty reads, it does allow non-repeatable reads and phantom reads (again, demonstrated below).
When a transaction is run under the REPEATABLE READ isolation level, users are guaranteed to get the same data from one read to the next within a transaction. To see this, execute the following query in SSMS:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ BEGIN TRANSACTION SELECT * FROM BankAccount --COMMIT TRANSACTION
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
Now execute the following query in a new SSMS window:
BEGIN TRANSACTION DECLARE @money INT DECLARE @accountId INT SET @money = 10 SET @accountId = 1 UPDATE BankAccount SET Checking = Checking - @money WHERE AccountId = @accountId UPDATE BankAccount SET Savings = Savings + @money WHERE AccountId = @accountId COMMIT TRANSACTION
UPDATE BankAccount SET Checking = Checking - @money WHERE AccountId = @accountId UPDATE BankAccount SET Savings = Savings + @money WHERE AccountId = @accountId COMMIT TRANSACTION
You will see this at the bottom of the query window:
Now return to the first query, highlight SELECT * FROM BankAccount, and press F5 (this executes the SELECT statement as part of the open transaction). Note that the same data is returned (you repeated a read). When you commit the transaction (highlight COMMIT TRANSACTION and press F5), you’ll find that the second query completes (i.e. the data is updated).
Because the the isolation level is REPEATABLE READ, the second query is blocked from updating data so that reads can be repeated in the first query. Running the transaction under the READ UNCOMMITTED or READ UNCOMMITTED isolation levels would not block the updating query from executing (thus allowing for a non-repeatable read). Note that the REPEATABLE READ isolation level does allow for phantom reads (explained below).
When a transaction is run under the SERIALIZABLE isolation level, other users will not get dirty reads, unrepeatable reads, or phantom reads. You can repeat any of the above scenarios with the isolation level set to SERIALIZABLE to see that dirty reads and non-repeatable reads are not possible. To see how this isolation level prevents phantom reads, consider the following example. (Note: to see an example of a phantom read, follow the example with the isolation level set to any of the other levels mentioned above).
Execute the following query in SSMS:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE BEGIN TRANSACTION SELECT * FROM BankAccount --COMMIT TRANSACTION
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
In a new SSMS window, execute the following query:
BEGIN TRANSACTION INSERT INTO BankAccount (AccountId, Checking, Savings) VALUES (2, 200, 200) COMMIT TRANSACTION
INSERT INTO BankAccount (AccountId, Checking, Savings) VALUES (2, 200, 200)
You will now see the familiar…
…indicating that the query is blocked. Return to the first query, highlight SELECT * FROM BankAccount, and press F5. Note that you do not see a second row returned (phantom reads are not possible). When you select COMMIT TRANSACTION and press F5 in the first query, the INSERT in the second query will then be allowed to complete.
Isolation levels govern what data concurrent users can see during transactions. The most permissive isolation level is READ UNCOMMITTED and the most restrictive is SERIALIZABLE. The following table shows this in more detail:
As demonstrated above, data is locked according to the isolation level of a transaction. The implications of locked data should be considered carefully when designing an application that utilizes transactions.
Both the SQLSRV and PDO_SQLSRV drivers (downloadable here) support transactions and transaction isolation levels. The transaction isolation level is set when establishing a connection to the server. For more information, see the IsolationLevel keyword in the Connection Options topic of the driver documentation. Also note that the driver APIs for executing transactions (sqlsrv_begin_transaction, sqlsrv_commit, sqlsrv_rollback in the SQLSRV driver, and PDO::beginTransaction, PDO::commit, PDO::rollback in the PDO_SQLSRV driver) should be used instead of executing a SQL such as sqlsrv_query($conn, “BEGIN TRANSACTION”).
That’s it for today. Hope this proves helpful.
Thanks.
-Brian
Share this on Twitter
What a great article! I was at the presentation (which was amazingly good) and am very glad to have the great content here in easy-to-reference form.
Nice job. And thanks for the reminder to not force Transactions but rather use the transaction forms built-in to the abstraction services (SQLSRV, PDO, etc.). I was just about to...
Thanks! I'm glad you found it helpful...hopefully, others will too. :-)
First example seemes to me somehow wrong.
if( $stmt1 && $stmt2 )
{
sqlsrv_commit( $conn );
echo "Transaction was committed.<br/>";
...
}
if sqlsrv_commit throws exception, as it might, your data will get corrupted. Instead of if( $stmt1 && $stmt2 ) you should do something like:
try {
(..)
} catch (Exception $e) {
sqlsrv_rollback( $conn );
this way You are 100% sure, your data will not get corrupted.
@yon85-
Thanks for the comment. I agree with you that some sort of error handling should be done for the call to sqlsrv_commit. However, the sqlsrv extension does not support exceptions. To get any error information, you have to call sqlsrv_errors. So, something like this might work:
if($stmt1 && $stmt2)
if(!sqlsrv_commit($conn))
// call sqlsrv_errors here
// take appropriate action
else
sqlsrv_rollback($conn);
That said, the pdo_sqlsrv extension DOES support exceptions. If you use that extension, you could write code like you suggested (which is much more elegant than the code above).