Distributed Partitioned Views / Federated Databases: Lessons Learned

Distributed Partitioned Views / Federated Databases: Lessons Learned

Rate This
  • Comments 14

Introduction

This article contains information about the things we have learned while working with Federated Databases.  Before beginning it is necessary to define the terms being used.  Included in this article is one solution in production that is using federated servers and Distributed Partitioned Views.

 

Definitions

Definition 1:  Local Partitioned View – A single table is horizontally split into multiple tables, usually all have the same structure.

 

Example 1:

Original table is named FACT.

New tables are named FACT2005, FACT2006, and FACT2007.  (I chose to partition by date, you can choose anything else that makes sense for your application, like branch office).

 

CREATE VIEW FACT AS

SELECT <select list> FROM dbo.FACT2005

UNION ALL

SELECT <select list> FROM dbo.FACT2006

UNION ALL

SELECT <select list> FROM dbo.FACT2007

 

You will notice that the view is created with the same name as the original table.  This way the applications don’t have to change.

 

You also have to create a check constraint on the same column in each table to create a partitioning “key”.  Read more about this and other very important restrictions in Books on Line at: ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/aecc2f73-2ab5-4db9-b1e6-2f9e3c601fb9.htm.  There are several good places in Books on Line to read about partitioned views; this link will give you a launching point for the other ones.

 

Note:  Local partitioned views are popular in both OLTP and data warehouse projects.  We rarely hear about problems with these, but there are known problems with very complex queries when the optimizer does not do partition elimination.  I will not spend much time on local partitioned views in this article as I would like to concentrate on Distributed Partitioned Views.

 

Definition 2: Cross Database Partitioned View – tables are split among different databases on the same server instance. Notice the three part name using the database in Example 2 below.

 

Example 2: 

CREATE VIEW FACT AS

SELECT <select list> FROM DB2005.dbo.FACT2005

UNION ALL

SELECT <select list> FROM DB2006.dbo.FACT2006

UNION ALL

SELECT <select list> FROM DB2007.dbo.FACT2007

 

Note:  The most frequent question I get from people attempting a view like this that has tables in multiple databases on the same instance is about joins.  You don’t lose too much performance with cross database joins.  This is something to think about because you will normally join this view to some reference tables for the application.  If you carry the reference tables in a database called COMMON for example, then you will most likely see something like SELECT * FROM dbo.FACT JOIN COMMON.dbo.Customer ON ….. WHERE …

 

You will also notice in this example that I put each fact table in its own database to facilitate easier scale out across servers or instances if you need to do this at a later time.

 

 

Definition 3: Distributed (across server or instance) Partitioned View.  Tables participating in the view reside in different databases which reside on different servers or different instances.  Note the four part name which includes the actual server name (or the cluster name if this is in a Windows Failover Cluster).

 

Example 3: 

CREATE VIEW FACT AS

SELECT <select list> FROM              DB2005.dbo.FACT2005

UNION ALL

SELECT <select list> FROM server2.DB2006.dbo.FACT2006

UNION ALL

SELECT <select list> FROM server3.DB2007.dbo.FACT2007

 

You will notice that the server name is missing from the first server.  This view definition in Example 3 exists on server1.  You cannot use a linked server to refer to the local server.    You might immediately recognize a potential problem and try to create the view in Example 4 on server 2.  However, the trick is to change the linked server definitions so that the same view code in Example 3 is deployed to every server. 

 

BAD Example 4:  (don’t do this)

CREATE VIEW FACT AS

SELECT <select list> FROM  server1.DB2005.dbo.FACT2005

UNION ALL

SELECT <select list> FROM               DB2006.dbo.FACT2006

UNION ALL

SELECT <select list> FROM server3.DB2007.dbo.FACT2007

 

Instead of doing the BAD Example 4, use the style in Example 3 and change the linked server definitions on every server.

 

 

Infrastructure architecture for one DPV project

Here is one production project to be used as a reference for DPV.  The customer had 3,000 branch offices to automate and determined that the total size representing 3 months worth of data was going to be approximately 3.6TB.  They didn’t want to risk putting this all on one server, so their basic approach was to break this into manageable size pieces.  There are 6 servers in 2 geographically separated data centers.  Each server has 2 instances so that if one server gets too busy they can easily move the second instance to another server.  The data is spread out over 12 instances.  The servers are 8 socket dual core machines with 16GB RAM.  Each instance is responsible for approximately 300GB.  Growth is expected to double soon which is why they selected machines this big.

 

Another reason they bought bigger machines than they need for a normal load is to have a high availability strategy.  The three machines in each data center are in one cluster and if one machine goes down another machine can pick up the load.  If two machines go down, one machine will do the work of all 3 and they expect that performance will be slow until the problem is fixed.  If an entire data center goes down there is no solution in place yet – this is a later phase of the project.

 

The most important point in the success of this project and makes this project work so well is that they are not using load balancing.  The users at each branch are connected directly to a server that contains their data.  So even though most of the inserts, updates and deletes are done through the partitioned views, the work is mostly local to one server.  There are some corporate users issuing queries that need data from multiple instances and it is expected that most of these queries will touch multiple servers.

 

Reasons for Distributing:

The main reason for distributing the 3.6TB over 12 servers was to break the data into manageable size pieces.  This makes the hard maintenance problem in a VLDB much easier to deal with; i.e. backups, index and statistics maintenance. 

 

If you use table partitioning as well as the partitioned views, then you have even smaller pieces to manage for most of your administrative processes.

 

 

WARNING:

Although some federated server projects are successful, we have found several problems during these projects.  Fortunately we found solutions for each problem.  The warning is that we don’t know if we have uncovered all the problems.  Your workload may be different enough that other problems will be discovered.  So test, test, test if you decide to distribute a single database over multiple servers.

 

 

Two main problem patterns

The two main problems we have found so far are:

1)      When a command gets sent to every server when you think it should only go to one server.  This happens when the query optimizer thinks it has to check the schema on every server, as in the case when the same collations are not used on all the servers (see note below in the lessons learned section).

2)      When cross-server join copies the records from the remote server to the local server and then performs the join.  This is called a non-remotable query.  The optimizer is pretty good at copying the smaller table (or result set) to the right server before performing the join.  Still, it is a situation that should be avoided in order to get the most consistent performance.  Try to make all the joins happen on one server (either all on the remote or all on the local) without copying records across the network.  See the notes below on advice on how to avoid this.

 

 

Here is a list of the things we have learned to do and not do on this project. 

 

Lessons Learned on Distributed Partitioned Views: (multiple servers involved)

  1. Follow the guidelines in Books On Line very carefully.  There are many links from this main one.  I recommend reading and re-reading these until you know the subject very well before you start.  ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/udb9/html/6e44b9c2-035e-4c88-907f-eef880c5540e.htm.  If you are on-line and can get to this site:  http://msdn2.microsoft.com/en-us/library/ms175129(SQL.90).aspx, read it and any of the links it has on this page.
  2. Avoid cross server joins whenever possible.  One solution is to replicate all your reference/dimension tables to every server.  In a cross server join, the necessary records are copied from the remote server to the local server, then the join is performed. 
  3. Use the same collations in all databases.  Otherwise the startup filters are not applied and the queries are always sent to servers with different collation.
  4. Use the same session settings in all connections.  Otherwise startup filters are not applied and the queries are always sent to servers with different session settings.
  5. Create an index with the partitioned column as the leading column on the index because most of your queries will contain the partitioned column in the where clause.  The optimizer uses this index and the associated statistics to do more efficient queries.
  6. Plan guides do not work for distributed queries.
  7. After restarting SQL Server, create some startup stored procedures that will run the queries that you need.  Otherwise the first user running each query will pay a heavier penalty because it has to touch every server.  These startup procedures will also create a local connection pool to each server.  Even though creating a connection is fast it will still be better if the first user doesn’t have to wait for this too. 
  8. Don’t forget to do the tip in Books-On-Line:  Turn on Lazy Schema Validation.  This will give you better performance.  It helps to avoid sending all queries to all servers.
  9. Always have the correct data type in your statements.  Avoid situations where SQL Server has to automatically convert to a type in your table.  Otherwise the query always gets sent to every server for execution.   In the following example, the partition key column (name) is defined in all the tables as NVARCHAR.

 

DON’T DO THIS

SELECT column1 FROM myPartitionedView WHERE name = ‘ABC’  -- note ‘ABC’ is interpreted as VARCHAR.

 

DO THIS

SELECT column1 FROM myPartitionedView WHERE name = N’ABC’

Or

SELECT column1 FROM myPartitionedView WHERE name = CAST (NVARCHAR, ‘ABC’)

 

  1. Don’t use non-deterministic functions directly in your insert or update statements.   This causes the optimizer to send the query to all servers.

 

DON’T DO THIS

UPDATE myPartitionedView SET column1 = GETDATE() WHERE …

 

DO THIS

DECLARE @dtNow datetime

SET @dtNow = GETDATE()

UPDATE myPartitionedView SET column1 = @dtNow WHERE …

 

  1. The Estimated plan appears to show that all servers will be touched all the time.  But in reality the startup filter will eliminate the partitions at run time.  This is by design and the challenge is to teach yourself about the definition of a startup filter.  You can find a good write up at: http://www.sqlmag.com/articles/index.cfm?ArticleID=9097.  (you must have an account to read this).  I have not found much else written on the topic of a start up filter and I wish I could paste a screen capture in this blog because it is easier to understand once you see the picture.
  2. To avoid the problem of some high severity errors not getting returned properly from the remote server use the following line of code:

DO THIS

SET IMPLICIT_TRANSACTIONS ON

 

What happens is that the Try…Catch blocks are not activated properly on the local server.  For example, if the following insert generated a duplicate key error, then the whole batch aborts and the Commit Transaction as well as the Try…Catch are ignored.  This could leave a distributed transaction open.  With the SET IMPLICIT_TRANSACTION ON command, it works properly.

 

Begin Try

   Begin Transaction

      Insert myPartitionedView (col1, col2) VALUES (10, ‘a’)

   Commit Transaction

End Try

Begin Catch

   Print “error occurred”

   Rollback Transaction

End Catch

 

  1. The UI for the actual plan does not show which partitions were really touched.  You have to SET STATISTICS PROFILE ON to get the following line for each partition.

 

<RunTimeCountersPerThread Thread="0" ActualRebinds="0" ActualRewinds="0" ActualRows="0" ActualEndOfScans="0" ActualExecutions="0" />

 

Debugging performance problems with Distributed Queries

The easiest method is to start SQL Profiler on each server to see which commands are being sent to which server.  In general, the startup filters will be applied and the commands will only be sent to the correct servers.  Also use the SET STATISTICS PROFILE ON that was documented above.

 

Kevin Cox, Teodoro Marinucci, Silvia Bevilacqua

(also thanks to Stefano Stefani, Gargi Sur and Rakesh Gujjula from the SQL Product Team)

Leave a Comment
  • Please add 5 and 2 and type the answer here:
  • Post
  • The Microsoft SQL Server Development Customer Advisory Team has a nice blog post about Distributed Partitioned

  • Great overview/summary..thanks!

  • Good article. I have a question on Definition 3, around the GOOD and BAD examples for the partitioned view.

    If server1 hosts DB2005 locally, how will the same view work on server2? I can see how changing the linked server definition gives you indirection for the non-local databases, but what happens to the local database? What am I missing here? The BAD example looks to me to be the 'correct' (but inelegant) solution.

    Thanks

    Ewan

  • great article. I'm also having difficulty implementing definition 3 above:

    On Server1: when you create a linked server in sql server 2005, in the "New Linked Server" window / general Tab, If I use an alias like for example "Server2", and then choose "Other data source", and then I put in the server ip in the data source, it gets added, but I cannot run queries against it, I get the following error:

    OLE DB provider "SQLNCLI" for linked server "SERVER2" returned message "Login timeout expired".

    OLE DB provider "SQLNCLI" for linked server "SERVER2" returned message "An error has occurred while establishing a connection to the server. When connecting to SQL Server 2005, this failure may be caused by the fact that under the default settings SQL Server does not allow remote connections.".

    but when I use the option "Sql Server" in the Sql type, it works but I cannot use the alias "Server2" in that case, hence I will not be able to do it like in definition 3.

  • Good article.

    May you post more articles about this theme.

    More detail would be welcome.

    regards

  • Good article.

    We have VLDB on SQL2000. Our primary table contains slightly less than 1B records, and I am anticipating 450M records per year.

    Thus we have implemented distributed view on 2000 across 6 clusters, and it works reasonably well. As we are migrating to SQL2005, I am not certain of the maturity of partitioning to migrate to that architecture, as well as having everything on a single server. My first reaction is to architect a hybrid solution, where I would use partitioning on the local servers, and then have a distributed view on top of all the partitions.

    I have concern implementing the above as MS is saying that distributed view is going to be removed from future releases, but I doubt that partitioning would be able to handle our data volume.

    Any thoughts would be appreciated.

  • Distributed view is not going to be removed from future releases, at least not any time soon.

    Table Partitioning would handle your data volume as 1B records is not very much in today's world.  Of course that depends on what type of hardware you are running on.  But if you choose table partitioning you may be able to minimize the impact by marking some of the partitions with older data as read-only.  

    Your hybrid solution could also work well because you can use table partitioning on the local servers as well as use a distributed view on top of all the tables.  The Optimizer is very good in this case to only touch the tables and partitions it needs if you supply all the partitioning keys in the queries.

  • If identities are not allowed in distributed views.

    May you help me underestand how do you handle non significant pk fields? do you use a separated table with a counter per table?

    Where can I get an detailed example?

    Thanks

  • A GUID is the most popular solution instead of using Identities when you want to create a partitioned view.  

  • Nice to see an article on this older strategy.

    I will say GUIDs are probably the worst solution when it comes to Clustered Index fragmentation due to their very nature.

  • Nice to see an article on this older strategy.

    I will say GUIDs are probably the worst solution when it comes to Clustered Index fragmentation due to their very nature.

  • Nice to see an article on this older strategy.

    I will say GUIDs are probably the worst solution when it comes to Clustered Index fragmentation due to their very nature.

  • I said GUIDS are the most popular solution for DPV, not necessarily the most efficient.  Although I like them because of their simplicity.  And using a FILL FACTOR of 60 - 70 is a good solution for index maintenance jobs if you are using GUIDS.  This minimizes the page splits caused by frequent inserts into the middle of the table/indexes.  

  • An innovative strategy but it does not seem to tackle some of the major challenges inherent in MPP.

    Such distributions are by there nature more prone to failures  ie If one system has a MTBF of X then a cluster of N systems will have a MTBF of X/N  and fail at least N times as often.

    This problem does not seem to be addressed.

    Also, joins are greatly restricted.  In an MPP design joins should run at least N times faster.  This is not the case here.

    DeepCloud have an MPP system in Beta that solves these issues and is 100's of times faster.

    We are looking for interested parties to participate in our free beta program.

    See: http://www.deepcloud.co

Page 1 of 1 (14 items)