In the world of relational databases circular references are schema structures where foreign keys relating the tables create a loop. Circular references cause special types of issues when trying to synchronize two relational database where the foreign keys are enforced. Because of this issue, database schemas that contain circular references are restricted in the tools that can be used when synchronizing and replicating the database. This article will explain circular references and demonstrate a Transact-SQL script for determining if your database has a circular reference.
Foreign keys create database-enforced integrity constraints. These constraints ensure that a row of data exists in one table before another table can reference it. They also prevent a dependent row from being deleted that another row references. In Figure 1 we see a simple foreign key between Address table and StateProvince table in the Adventure Works database.
Figure 1
A circular reference is one or more tables where the foreign keys create a loop. Figure 2 is an example.
Figure 2
In this case the City table contains a reference to the author; it is the author that wrote the description for the city. The Author table has a reference to the city, because each author lives in a city. So which came first, the city or the author? In all cases with circular references one of the foreign key columns must be accept a null value. This allows the data to be inserted in 3 passes:
A circular reference is not limited to two tables, it might involve many tables, all bound together in one big circle.
A special case circular reference is the self-referencing table. This is a table that has a foreign key column that references its own primary key. An example is a human resource schema that tracks employees and their bosses. In the employee table, there is a foreign key column called boss that references the primary key column in the employee table. Self-referencing tables always have a foreign key column which is nullable and at least one null exists. In the example above it would be the CEO, since he doesn’t have a boss his boss column is null.
Tables that are not involved in a circular reference are easy to synchronize, you make a complete table update the table without dependencies on it, then update the tables with foreign key dependences. In Figure 1 you would update the StateProvince table, then the Address table. This explanation is simplified, for example the deletes are done in the reverse order. If the tables have no circular references you can synchronize them table by table if you know their dependency order.
Synchronizing tables with circular references is much harder, because you have to update the tables row by row, jumping back and forth between the tables, inserting the nullable foreign key with nulls first, then updating them later. Again this is a simplified explanation; the point is that you can’t update the tables in a serial order if there are circular references.
There are really only a couple ways to synchronize database that contains tables with circular references:
The Transact-SQL script below uses a recursive cursor to detect if there are any circular references in your database schema. It can be run on your SQL Server database before you try to synchronize it with SQL Azure, or you can run it on your SQL Azure database. You can run it in the Query Window of SQL Server Management Studio; the output will be displayed as in the Message section.
If you have circular references the output will look like this:
dbo.City -> dbo.Author -> dbo.City dbo.Division -> dbo.Author -> dbo.City -> dbo.County -> dbo.Region -> dbo.Image -> dbo.Division dbo.State -> dbo.Image -> dbo.Area -> dbo.Author -> dbo.City -> dbo.County -> dbo.Region -> dbo.State dbo.County -> dbo.Region -> dbo.Author -> dbo.City -> dbo.County dbo.Image -> dbo.Area -> dbo.Author -> dbo.City -> dbo.County -> dbo.Region -> dbo.Image dbo.Location -> dbo.Author -> dbo.City -> dbo.County -> dbo.Region -> dbo.Image -> dbo.Location dbo.LGroup -> dbo.LGroup dbo.Region -> dbo.Author -> dbo.City -> dbo.County -> dbo.Region dbo.Author -> dbo.City -> dbo.Author dbo.Area -> dbo.Author -> dbo.City -> dbo.County -> dbo.Region -> dbo.Image -> dbo.Area
Each line is a circular reference, with a link list of tables that create the circle. The Transact-SQL script to detect circular references is below, however you can also download from this page. This code will work on SQL Azure and SQL Server.
SET NOCOUNT ON -- WWB: Create a Temp Table Of All Relationship To Improve Overall Performance CREATE TABLE #TableRelationships (FK_Schema nvarchar(max), FK_Table nvarchar(max), PK_Schema nvarchar(max), PK_Table nvarchar(max)) -- WWB: Create a List Of All Tables To Check CREATE TABLE #TableList ([Schema] nvarchar(max), [Table] nvarchar(max)) -- WWB: Fill the Table List INSERT INTO #TableList ([Table], [Schema]) SELECT TABLE_NAME, TABLE_SCHEMA FROM INFORMATION_SCHEMA.TABLES WHERE Table_Type = 'BASE TABLE' -- WWB: Fill the RelationShip Temp Table INSERT INTO #TableRelationships(FK_Schema, FK_Table, PK_Schema, PK_Table) SELECT FK.TABLE_SCHEMA, FK.TABLE_NAME, PK.TABLE_SCHEMA, PK.TABLE_NAME FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS C INNER JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS FK ON C.CONSTRAINT_NAME = FK.CONSTRAINT_NAME INNER JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS PK ON C.UNIQUE_CONSTRAINT_NAME = PK.CONSTRAINT_NAME INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE CU ON C.CONSTRAINT_NAME = CU.CONSTRAINT_NAME INNER JOIN ( SELECT i1.TABLE_NAME, i2.COLUMN_NAME FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS i1 INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE i2 ON i1.CONSTRAINT_NAME = i2.CONSTRAINT_NAME WHERE i1.CONSTRAINT_TYPE = 'PRIMARY KEY' ) PT ON PT.TABLE_NAME = PK.TABLE_NAME CREATE TABLE #Stack([Schema] nvarchar(max), [Table] nvarchar(max)) GO -- WWB: Drop SqlAzureRecursiveFind IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[SqlAzureRecursiveFind]') AND type in (N'P', N'PC')) DROP PROCEDURE [dbo].[SqlAzureRecursiveFind] GO -- WWB: Create a Stored Procedure that Recursively Calls Itself CREATE PROC SqlAzureRecursiveFind @BaseSchmea nvarchar(max), @BaseTable nvarchar(max), @Schmea nvarchar(max), @Table nvarchar(max), @Fail nvarchar(max) OUTPUT AS SET NOCOUNT ON -- WWB: Keep Track Of the Schema and Tables We Have Checked -- Prevents Looping INSERT INTO #Stack([Schema],[Table]) VALUES (@Schmea, @Table) DECLARE @RelatedSchema nvarchar(max) DECLARE @RelatedTable nvarchar(max) -- WWB: Select all tables that the input table is dependent on DECLARE table_cursor CURSOR LOCAL FOR SELECT PK_Schema, PK_Table FROM #TableRelationships WHERE FK_Schema = @Schmea AND FK_Table = @Table OPEN table_cursor; -- Perform the first fetch. FETCH NEXT FROM table_cursor INTO @RelatedSchema, @RelatedTable; -- Check @@FETCH_STATUS to see if there are any more rows to fetch. WHILE @@FETCH_STATUS = 0 BEGIN -- WWB: If We have Recurred To Where We Start This -- Is a Circular Reference -- Begin failing out of the recursions IF (@BaseSchmea = @RelatedSchema AND @BaseTable = @RelatedTable) BEGIN SET @Fail = @RelatedSchema + '.' + @RelatedTable RETURN END ELSE BEGIN DECLARE @Count int -- WWB: Check to make sure that the dependencies are not in the stack -- If they are we don't need to go down this branch SELECT @Count = COUNT(1) FROM #Stack WHERE #Stack.[Schema] = @RelatedSchema AND #Stack.[Table] = @RelatedTable IF (@Count=0) BEGIN -- WWB: Recurse EXECUTE SqlAzureRecursiveFind @BaseSchmea, @BaseTable, @RelatedSchema, @RelatedTable, @Fail OUTPUT IF (LEN(@Fail) > 0) BEGIN -- WWB: If the Call Fails, Build the Output Up SET @Fail = @RelatedSchema + '.' + @RelatedTable + ' -> ' + @Fail RETURN END END END -- This is executed as long as the previous fetch succeeds. FETCH NEXT FROM table_cursor INTO @RelatedSchema, @RelatedTable; END CLOSE table_cursor; DEALLOCATE table_cursor; GO SET NOCOUNT ON DECLARE @Schema nvarchar(max) DECLARE @Table nvarchar(max) DECLARE @Fail nvarchar(max) -- WWB: Loop Through All the Tables In the Database Checking Each One DECLARE list_cursor CURSOR FOR SELECT [Schema], [Table] FROM #TableList OPEN list_cursor; -- Perform the first fetch. FETCH NEXT FROM list_cursor INTO @Schema, @Table; -- Check @@FETCH_STATUS to see if there are any more rows to fetch. WHILE @@FETCH_STATUS = 0 BEGIN -- WWB: Clear the Stack (Don't you love Global Variables) DELETE #Stack -- WWB: Initialize the Input SET @Fail = '' -- WWB: Check the Table EXECUTE SqlAzureRecursiveFind @Schema, @Table, @Schema, @Table, @Fail OUTPUT IF (LEN(@Fail) > 0) BEGIN -- WWB: Failed, Output SET @Fail = @Schema + '.' + @Table + ' -> ' + @Fail PRINT @Fail END -- This is executed as long as the previous fetch succeeds. FETCH NEXT FROM list_cursor INTO @Schema, @Table; END -- WWB: Clean Up CLOSE list_cursor; DEALLOCATE list_cursor; DROP TABLE #TableRelationships DROP TABLE #Stack DROP TABLE #TableList DROP PROC SqlAzureRecursiveFind
Do you have questions, concerns, comments? Post them below and we will try to address them.
Waine,
Back in 2004 I wrote a stored procedure quite similar to yours. If you get curious and want to see how mine was implemented, it is available at weblogs.asp.net/.../Stored-Procedure-ForeignkeysAnalyze.aspx
Cheers,
Luciano,
Very interesting, I search for a long time trying to find something, before writing my own. I didn't find your script. You need to update your CSS, modify the wrap on your script, or provide it as a download. There is a formatting error with your web site that prevents copy and paste of the script.
-Wayne
I can name that tune in one query :)
with q as
(
select object_id fk, referenced_object_id, 0 fk_count, '<fk id="' + cast(object_id as varchar(max))+ '"/>' reference_path, 0 done
from sys.foreign_keys
union all
select object_id fk, k.referenced_object_id, q.fk_count + 1, q.reference_path + '<fk id="' + + cast(k.object_id as varchar(max)) + '"/>',
case when q.reference_path like '%' + cast(k.object_id as varchar(max)) + '%' then 1 else 0 end done
from q
join sys.foreign_keys k
on q.referenced_object_id = k.parent_object_id
and q.done = 0
), qq as
select q.fk, CAST(reference_path as XML).query('
for $e in /fk
where $e >> /fk[1]
order by $e/@id
return $e') path_xml
where q.done = 1
)
select cast(path_xml as varchar(max)) cycle, fk.name fk_name, object_name(fk.parent_object_id) table_name
from qq
join sys.foreign_keys fk
on qq.fk = fk.object_id
order by cast(path_xml as varchar(max))
David
Wayne,
I embedded a dbo.ForeignkeysAnalyze.storedprocedure.sql.txt file to the post. It is hard to note the post has an attachment because attachments at weblogs.asp.net are not given much visibility, so, in case someone be not able to find out where on my post the attachment is, he/she can reach it directly at weblogs.asp.net/.../70588.ashx
Wayne -- Please help. In your nice article, you state "If you have circular references the output will look like this" but you do not state "If you DO NOT have circular references the output will look like this". I am curious as to what the output should be if if there are no circular references. I ran this script in Sql Managment Studio 2005 Express against a database on Server 2005 and the "Results" pane just said "Command(s) completed successfully" so can I assume that I have no circular references? Please advise. Thank you. -- Mark Kamoski
Mark, What do you see in the message tab? PRINT commands go to the message tab.
Wayne -- I do not seem to have a "message tab" in my Sql Server 2005 Management Studio Express environment. Or maybe I just cannot find it? I posted a screenshot here www.logicbus.net/.../ImageForPostOnSql201009240913.PNG to let you know what it looks like. I am getting the feeling that this is "something simple" but I just cannot see it now. Ug. What do you think? Do you have any other suggestions? Please advise. Thank you. -- Mark Kamoski
Mark: You are right -- the results of no circular references is that there is no message tab, which only appears if a PRINT is executed.
Wayne -- That is GOOD news for me. It appears that my database schema is OK, at least in this respect. Who would have thought anything less? (Ahem.) Now, I just need to unravel Luciano's link weblogs.asp.net/.../Stored-Procedure-ForeignkeysAnalyze.aspx where he has a notion of "recursive relationships" and "doubtful relationships", which, at first glance, seem to be identified in a fuzzy and non-deterministic fashion, although probably worth looking at for analysis. Have you looked at his script and output at all and his definitions of "recursive relationships" and "doubtful relationships"? Please advise. Thank you. -- Mark Kamoski
well,well, federated databases!! commit searchers around different locations before commiting, data replicated across!! typical Entriprise distributed applications for supply chain and product data definitley left me with this kind of situation many times!!.. Berry`s work is good, definely good approach, hope it explands
Thanks.
It was a great help for me to find circular references in my db.
When, is SQL Azure data sync going to solve Circular reference problem?
when I run the script it outputs nothing (just 'Command(s) completed successfully.
') but when I try to deploy the sync I still get the error saying i have circular references!
It 's very good and useful code. thank you very mutch.