• SQL Server Cast

    Open Windows Firewall Ports for SQL Server the Easy Way

    • 12 Comments

After installing a new SQL Server 2008 instance on my Windows 7 laptop I found myself unable to log on locally to the instance using SQL Server Management Studio (SSMS). I knew that, due to the changes introduced in the Windows Server 2008 and Windows Vista firewall, this was possibly caused by a missing open port.

The Configuring the Windows Firewall to Allow SQL Server Access Books Online entry contains the information needed to open the required ports in Windows Firewall for each one of the SQL Server services. However, I recently discovered a much easier way to open the required Windows Firewall ports for SQL Server in the following Knowledge Base article (KB968872):


This article includes the Microsoft "Fix it" troubleshooter that allows you to quickly fix the related problem using an easy-to-follow setup. The "Fix it" button is designed for Windows Server 2008, but the script that is executed by the setup is included in the same KB article and can be executed on Windows Vista or Windows 7 as well:

    @echo =========  SQL Server Ports  ===================
    @echo Enabling SQLServer default instance port 1433
    netsh firewall set portopening TCP 1433 "SQLServer"
    @echo Enabling Dedicated Admin Connection port 1434
    netsh firewall set portopening TCP 1434 "SQL Admin Connection"
    @echo Enabling conventional SQL Server Service Broker port 4022 
    netsh firewall set portopening TCP 4022 "SQL Service Broker"
    @echo Enabling Transact-SQL Debugger/RPC port 135
    netsh firewall set portopening TCP 135 "SQL Debugger/RPC"
    @echo =========  Analysis Services Ports  ==============
    @echo Enabling SSAS Default Instance port 2383
    netsh firewall set portopening TCP 2383 "Analysis Services"
    @echo Enabling SQL Server Browser Service port 2382
    netsh firewall set portopening TCP 2382 "SQL Browser"
    @echo =========  Misc Applications  ==============
    @echo Enabling HTTP port 80
    netsh firewall set portopening TCP 80 "HTTP"
    @echo Enabling SSL port 443
    netsh firewall set portopening TCP 443 "SSL"
    @echo Enabling port for SQL Server Browser Service's 'Browse' Button
    netsh firewall set portopening UDP 1434 "SQL Browser"
    @echo Allowing multicast broadcast response on UDP (Browser Service Enumerations OK)
    netsh firewall set multicastbroadcastresponse ENABLE


Executing this on Windows 7 will show several warnings since the netsh firewall command is deprecated on this Windows version (netsh advfirewall firewall is the recommended method), but the script will perform its function nevertheless.
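If you want to avoid the deprecation warnings, the same rules can be created with the newer syntax; here is a minimal sketch for the default instance port (the rule name is arbitrary, and the remaining ports follow the same pattern):

@echo Enabling SQLServer default instance port 1433
netsh advfirewall firewall add rule name="SQLServer" dir=in action=allow protocol=TCP localport=1433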

Kudos to the Microsoft "Fix it" team!

  • SQL Server Cast

    Synchronize two tables using SQL Server Integration Services (SSIS)–Part I of II

    • 7 Comments

There are situations where a SQL Server DBA needs to keep two different tables in sync. Usually, the DBA needs to keep a copy of a table in a data warehouse repository that is used as a solution for archiving and/or reporting.

SQL Server provides a robust method for keeping data synchronized across databases using replication, but there are situations when all we need is just to keep an online copy of a single table for archiving or reporting purposes, and we would prefer not to tinker with SQL Server replication.

This post is divided into two parts: Part I explains how to update a destination table with the information that is added to a source table, while Part II explains how to replicate into the destination table any change that happens to existing records in the source table. Please take into account that this is just one way of doing this; there are many other possibilities, but the one listed here is probably one of the fastest to implement and modify on-demand.

    This procedure is based on the following scenario: A “table A” on “database A” is periodically replicated using SSIS into “table B” on “database B”. “Table A” is updated with new records and we need to copy those records into “table B” too. Our final implementation will look like this in SQL Server Business Intelligence Development Studio:

[Image: the completed Data Flow in Business Intelligence Development Studio]


    Let’s see how this works:

1. “Source Table” is “table A” in “database A”, while “Dest Table” is the destination “table B” in “database B”. We start by creating two different OLE DB connections in the SSIS Data Flow, using both the source and the destination tables as data sources.

    2. We need to perform a JOIN operation on the two different data sources to copy the information we need from one table into the other. For this JOIN to work correctly, the data has to be sorted; this is described in the following MSDN link:

    In Integration Services, the Merge and Merge Join transformations require sorted data for their inputs. The input data must be sorted physically, and sort options must be set on the outputs and the output columns in the source or in the upstream transformation. If the sort options indicate that the data is sorted, but the data is not actually sorted, the results of the merge or merge join operation are unpredictable.
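In practice, the easiest way to meet this requirement is to have each OLE DB source return the data already ordered by the join key, and then mark the source output as sorted (IsSorted = True on the output and SortKeyPosition = 1 on the key column, both set from the source's Advanced Editor). A minimal sketch of such a source query, with hypothetical column names:

SELECT No_, FieldA, FieldB
FROM dbo.TableA
ORDER BY No_;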

The “Merge Join” operation is where we separate the data that has been added to the source table (the data we need) from the data that already exists in the destination (the data we do not need). In our case the source table (on the left) contributes all the columns we want to copy, while the destination table (on the right) contributes only the column that corresponds to the primary key “No_”. Here is the description of this task:

[Image: Merge Join transformation editor]

Here is the key part of the process: the Left Outer Join retrieves all the records in the source table, but for those that do not exist in the destination table, the join key column “No_” from the destination side is retrieved as NULL. This is also described in the product documentation:

    To include all products, regardless of whether a review has been written for one, use an ISO left outer join. The following is the query:

    USE AdventureWorks2008R2;
    GO
    SELECT p.Name, pr.ProductReviewID
    FROM Production.Product p
    LEFT OUTER JOIN Production.ProductReview pr
    ON p.ProductID = pr.ProductID

    The LEFT OUTER JOIN includes all rows in the Product table in the results, whether or not there is a match on the ProductID column in the ProductReview table. Notice that in the results where there is no matching product review ID for a product, the row contains a null value in the ProductReviewID column.

3. Then we have to split the data we need from the data we do not need. For this we use a “Conditional Split” task that takes care of keeping only those records where the destination join key “No_” was NULL, i.e. keeping only the information for the new records. Here is a description of this conditional split task:

[Image: Conditional Split transformation editor]

4. Finally, we INSERT the resulting data from the Conditional Split task into the destination table, which is the same table we used as a data source at the very beginning.
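For reference, the whole data flow is equivalent to the following set-based T-SQL sketch (table and column names are hypothetical); the SSIS package remains the better choice when the two databases live on different servers:

-- Copy only the source rows that do not exist yet in the destination
INSERT INTO DatabaseB.dbo.TableB (No_, FieldA, FieldB)
SELECT a.No_, a.FieldA, a.FieldB
FROM DatabaseA.dbo.TableA AS a
LEFT OUTER JOIN DatabaseB.dbo.TableB AS b
    ON a.No_ = b.No_
WHERE b.No_ IS NULL;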

Because the Merge Join task is configured this way, you can re-run these tasks as many times as you want: the records in the destination table will not be duplicated with information from the source table; only the new records will be inserted into the destination table.

  • SQL Server Cast

    “SQLBrowser start request failed” Error Installing SQL Server 2008

    • 6 Comments

I worked the other day on a very tricky SQL Server 2008 setup problem and I think it is worth the time sharing here what I discovered. In this case my customer was trying to install SQL Server 2008 on a Windows XP SP3 computer to support a locally installed ERP software package.

The SQL Server 2008 setup was working fine at the beginning but failed in the middle of the process with the following error message:

    SQLBrowser start request failed. Click Retry to retry the failed action, or click Cancel to cancel this action and continue setup

Clicking on the Retry button retried the start operation, but it failed again with the same error, so we had no other option but to cancel the setup process.

Checking the Windows System Event log did not provide any hint about the root cause of the problem, so I decided to check the SQL Server 2008 setup logs, which can be found in the C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log folder; there are different files included in this folder, and each one of them provides information about one of the components installed (SSIS, AS, Engine, Client Tools, etc.). The Detail.txt log file, as the name implies, includes very detailed information on the overall setup process and is usually the first thing I look into when dealing with setup issues; here is what we found in our case:

    2009-05-13 17:25:13 SQLBrowser: The last attempted operation: Starting the SQL Server Browser
    service 'SQLBrowser', and waiting for up to '900' seconds for the process to complete.
    2009-05-13 17:25:13 Slp: Prompting user if they want to retry this action due to the following failure:
    2009-05-13 17:25:13 Slp: --------------------------------------
    2009-05-13 17:25:13 Slp: The following is an exception stack listing the exceptions in outermost to innermost order
    2009-05-13 17:25:13 Slp: Inner exceptions are being indented
    2009-05-13 17:25:13 Slp:
2009-05-13 17:25:13 Slp: Exception type: Microsoft.SqlServer.Configuration.Sco.ScoException
2009-05-13 17:25:13 Slp: Message:
    2009-05-13 17:25:13 Slp: Service ‘SQLBrowser’ start request failed.
    2009-05-13 17:25:13 Slp: Data:
    2009-05-13 17:25:13 Slp: Feature = SQL_Browser_Redist_SqlBrowser_Cpu32
    2009-05-13 17:25:13 Slp: Timing = Startup
    2009-05-13 17:25:13 Slp: DisableRetry = true

So I could see that the SQLBrowser service was indeed being created, but could not be started for some reason. I then decided to check the Summary.txt setup log file for a more comprehensive list of which components were and were not installed successfully; this file can also be found under the \LOG folder and contains a brief summary of the actions performed by the SQL Server setup. Interestingly enough, the error in this summary setup log file was not pointing to the Browser service but to MSXML 6.0 (the Microsoft XML engine):

    Detailed results:
    Feature:                       Database Services
    Status:                        Failure                    
    MSI status:                    Failure
      MSI error code:                0x5EBE5729
      MSI log file location:         C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log\20090514_170659\Msxml6_Cpu32_1.log

    So I looked up into this setup log file (Msxml6_Cpu32_1.log):

    MSI (s) (30:58) [17:22:37:661]: Note: 1: 1708
    MSI (s) (30:58) [17:22:37:661]: Product: MSXML 6.0 Parser (KB933579) – Installation error.

    MSI (s) (30:58) [17:22:37:661]: Windows Installer installed the product. Product Name: MSXML 6.0 Parser (KB933579). Product Version: 6.10.1200.0. Product Language: 3082. Installation success or error status: 1603.

    MSI (s) (30:58) [17:22:37:661]: Cleaning up uninstalled install packages, if any exist
    MSI (s) (30:58) [17:22:37:661]: MainEngineThread is returning 1603
    MSI (s) (30:68) [17:22:37:770]: No System Restore sequence number for this installation.

As you can see, SQL Server Browser was not the only component failing during the setup; the MSXML 6.0 engine was failing too. Comparing the timestamps in the Detail.txt and Msxml6_Cpu32_1.log files I could see the latter was older; in other words, the MSXML setup error was happening before the SQL Server Browser error. I went back to the Detail.txt file to confirm this point:

    2009-05-13 17:22:36 Slp: Running Action: Install_Msxml6_Cpu32_Action
    2009-05-13 17:22:36 Slp: Target package: "D:\x86\setup\x86\msxml6.msi"
    2009-05-13 17:22:37 Slp: InstallPackage: MsiInstallProduct returned the result code 1603.
    2009-05-13 17:22:38 Slp: Sco: Attempting to write hklm registry key SOFTWARE\Microsoft\Microsoft SQL Server to file C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log\20090514_170659\Registry_SOFTWARE_Microsoft_Microsoft SQL Server.reg_
    2009-05-13 17:22:38 Slp: Sco: Attempting to write hklm registry key SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall to file C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log\20090514_170659\Registry_SOFTWARE_Microsoft_Windows_CurrentVersion_Uninstall.reg_
    2009-05-13 17:22:38 Slp: Sco: Attempting to write hklm registry key SOFTWARE\Microsoft\MSSQLServer to file C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log\20090514_170659\Registry_SOFTWARE_Microsoft_MSSQLServer.reg_
    2009-05-13 17:22:43 Slp:
    2009-05-13 17:22:43 Slp: Watson bucket for Msi based failure has been created
    2009-05-13 17:22:43 Slp: No retry-able MSI return code detected.
    2009-05-13 17:22:43 Slp: Checkpoint: INSTALL_MSXML6_CPU32_ACTION
    2009-05-13 17:22:43 Slp: Completed Action: Install_Msxml6_Cpu32_Action, returned False
    2009-05-13 17:22:43 Slp: Error: Action "Install_Msxml6_Cpu32_Action" failed during execution.

Checking Add/Remove Programs in Control Panel, I could see the MSXML 6.0 engine listed as installed. This is one of the shared components installed by SQL Server 2008, but in this case the customer was unsure whether this component had been previously installed on the Windows XP computer. I decided to remove the MSXML 6.0 engine from the PC using the Add/Remove Programs applet and try the SQL Server 2008 setup again. This time the installation completed with no errors.

In this case the initial SQL Server Browser error was misleading because it was the first error shown by the Setup GUI. Only by looking inside the SQL Server setup log files were we able to discover the MSXML 6.0 installation issue that took place before it.

Another possible way of tracking down this error is by looking into the Watson bucket log file that exists in the same setup “Log” directory. In our case this was the content of that file:

    Watson bucket data:
      Bucket param 1: 10.0.1600.22
      Bucket param 2: 6.10.1200.0
      Bucket param 3: msxml6.msi
      Bucket param 4: 0x2D2816FE
      Bucket param 5: 0x5EBE5729
      Bucket param 6: Install_Msxml6_Cpu32_Action
      Bucket param 7:
      Bucket param 8:
      Bucket param 9:
      Bucket param 10:

I have reviewed the SQL Server 2008 SP1 release notes and this issue has been solved there, so another way to avoid this problem would be installing from slipstreamed SQL Server 2008 media; slipstreaming is new SQL Server functionality that allows the database administrator to integrate service packs into the installation media.
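As a sketch, assuming the Service Pack package has already been downloaded, the slipstream procedure boils down to extracting it and pointing the RTM setup at the extracted files (paths and package name here are illustrative):

@echo Extract the Service Pack to a local folder
SQLServer2008SP1-KB968369-x86-ENU.exe /x:C:\SQL2008SP1
@echo Launch the RTM setup telling it where to find the SP files
D:\setup.exe /PCUSource=C:\SQL2008SP1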

  • SQL Server Cast

    How to check Fragmentation on SQL Server 2005

    • 5 Comments

I have been asked this question several times: is there a tool in SQL Server where I can easily check database fragmentation? The answer, as with many other things, is "yes" and "no".

Fragmentation is a "natural" process in the database lifecycle. If the database is frequently updated via INSERT, UPDATE, or DELETE statements, we can expect it to become fragmented over time. If database indexes are fragmented, the SQL Server query optimizer will take not-so-optimal decisions when using an index to resolve a query, affecting the overall query performance. SQL Server provides tools to check and fix database fragmentation, but we first need to understand how to use them. This is just a little bit more difficult than simply firing up Windows defrag.exe to check for filesystem fragmentation.

Let's start with some theory so we can better understand what fragmentation is and how it affects database performance. There are two different types of fragmentation in SQL Server: internal and external. Internal fragmentation is the result of index pages taking up more space than needed. It is like having a book where some of the pages are left blank; we do not know which pages are blank until we read the entire book, and the same applies to SQL Server, which has to read all the pages in the index, wasting extra time and server resources on the empty pages. External fragmentation occurs when the pages of an index are not contiguous. Following the book analogy, it is like having a book where the pages are not ordered in a logical way (page 1, then page 2, then page 3 and so on), causing you to go back and forth to put the information together and make sense of the reading. Heavily used tables that contain fragmented indexes will impact your database performance. If you are still unsure about what external and internal fragmentation mean, refer to this article published in SQL Server Magazine by Kalen Delaney (although written for SQL Server 2000, the definitions are still valid for SQL Server 2005).

In our minds we associate fragmentation with something bad that should be avoided at any cost, but is this always the case? Not with internal fragmentation. If your tables are frequently changed via UPDATE and INSERT operations, keeping a small amount of free space on the index or data pages (that is, a small amount of internal fragmentation) avoids the page splits that are otherwise needed to allocate the new data. Page splits ultimately lead to external fragmentation, since the newly added page will probably not be adjacent to the original one. Internal fragmentation, therefore, can be desirable at low levels in order to avoid frequent page splits, while external fragmentation should always be avoided. Please understand that by 'low levels' I really do mean low levels. The amount of free space that can be reserved on an index can be controlled using the fill factor.
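For instance, a minimal sketch that reserves 10% of free space on each leaf-level page when rebuilding the AdventureWorks primary key index on HumanResources.Employee:

USE AdventureWorks;
GO
-- FILLFACTOR = 90 leaves 10% free space on each leaf-level page
ALTER INDEX PK_Employee_EmployeeID ON HumanResources.Employee
REBUILD WITH (FILLFACTOR = 90);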

    It is also important to understand that by external fragmentation we do not mean filesystem fragmentation or disk fragmentation.

SQL Server 2005 introduces a new DMV (Dynamic Management View) to check index fragmentation levels: sys.dm_db_index_physical_stats. Although SQL Server 2005 still supports the SQL Server 2000 DBCC SHOWCONTIG command, this feature will be removed in a future version of SQL Server. Here you can check the differences between both instructions when checking for fragmentation on the HumanResources.Employee table in the sample database AdventureWorks:

• DBCC SHOWCONTIG example:

    USE AdventureWorks;
    GO
    DBCC SHOWCONTIG ('HumanResources.Employee')
    GO

    DBCC SHOWCONTIG scanning 'Employee' table...
    Table: 'Employee' (869578136); index ID: 1, database ID: 6
    TABLE level scan performed.
    - Pages Scanned................................: 7
    - Extents Scanned..............................: 3
    - Extent Switches..............................: 2
    - Avg. Pages per Extent........................: 2.3
    - Scan Density [Best Count:Actual Count].......: 33.33% [1:3]
    - Logical Scan Fragmentation ..................: 14.29%
    - Extent Scan Fragmentation ...................: 33.33%
    - Avg. Bytes Free per Page.....................: 172.6
    - Avg. Page Density (full).....................: 97.87%

    • sys.dm_db_index_physical_stats DMV example:

    USE AdventureWorks
    GO
    SELECT object_id, index_id, avg_fragmentation_in_percent, page_count
    FROM sys.dm_db_index_physical_stats(DB_ID('AdventureWorks'), OBJECT_ID('HumanResources.Employee'), NULL, NULL, NULL);

In this last example I have selected only the relevant information to show from the DMV; you will see that the DMV can provide many more details about the index structure. In case you wanted to show fragmentation details for all the objects in the AdventureWorks database, the command would be as follows:

    SELECT *
    FROM sys.dm_db_index_physical_stats(DB_ID('AdventureWorks'), NULL, NULL, NULL , NULL);

    Please, refer to SQL Server 2005 Books Online for more information on sys.dm_db_index_physical_stats syntax.

How do we know if our database is fragmented? We have to pay attention to the avg_fragmentation_in_percent value. A value between 5% and 30% indicates moderate fragmentation, while any value over 30% indicates high fragmentation (book pages missing any order).

avg_page_space_used_in_percent is another value that is worth looking at closely. This value represents the amount of space used in the indexes. A value below 75% is usually associated with internal fragmentation (more blank pages in our book than recommended).

In order to reduce fragmentation we will have to reorganize or rebuild the indexes. Choosing between reorganizing and rebuilding depends on the resulting values: for moderate fragmentation an index reorganization will be enough, while heavily fragmented indexes need a rebuild. The following table summarizes when to use each one (refer to this Books Online article for in-depth coverage):

Reference values (in %)                      Action            SQL statement
avg_fragmentation_in_percent > 5 AND < 30    Reorganize index  ALTER INDEX REORGANIZE
avg_fragmentation_in_percent > 30            Rebuild index     ALTER INDEX REBUILD

The REORGANIZE statement is always executed online, while an index REBUILD is executed offline by default and can optionally be executed while the database is being accessed by users with the ONLINE option, which can be a plus for 24x7 environments. This approach, which is only available in SQL Server 2005, has some limitations; refer to the ALTER INDEX statement in SQL Server Books Online for more details.
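Using the same AdventureWorks table from the examples above, the two statements would look like this (ONLINE = ON requires the Enterprise Edition):

USE AdventureWorks;
GO
-- Moderate fragmentation: reorganize (always an online operation)
ALTER INDEX ALL ON HumanResources.Employee REORGANIZE;
GO
-- Heavy fragmentation: rebuild, optionally keeping the table accessible
ALTER INDEX ALL ON HumanResources.Employee REBUILD WITH (ONLINE = ON);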

If you are looking for an easy way to automate these processes, the SQL Server Books Online reference for sys.dm_db_index_physical_stats contains a sample script you can implement within minutes. This script will take care of reorganizing any index where avg_fragmentation_in_percent is below 30% and rebuilding any index where this value is over 30% (you can change these parameters for your specific needs). Add a new Execute T-SQL Statement task containing this script to your weekly or daily maintenance plan so you can keep your database fragmentation at an optimum level.

  • SQL Server Cast

    “OnlineThread: SQL Cluster shared data upgrade failed” error (ID 19019) installing Cumulative Update in SQL Server cluster

    • 5 Comments

Last week I was working with a customer on this obscure SQL Server error, which appeared right after installing Cumulative Update 9 (CU9) on a SQL Server 2008 two-node cluster. As you will see, the root cause of the problem was easy to understand, but getting to the point of finding that root cause was not.

In this case the customer was running a virtual instance of SQL Server 2008 SP1; this instance was running with no errors, and moving the clustered resources back and forth was not causing any problem. The customer was installing CU9 to address a database issue that was affecting the performance of a CRM system. Following best practices for SQL Server 2008/R2, she was installing the CU on the passive node first, in order to minimize the amount of downtime for the production CRM system. The problem was that trying to start the SQL Server service on the updated cluster node failed immediately.

    We looked into the Windows Event log and Cluster log but the information from these two sources did not provide any clue:

    Application Event log:

    10/09/2010 08:08:15 PM  Error         NODE1 19019   MSSQL$SQL2008                    Failover        N/A                                [sqsrvres] OnlineThread: SQL Cluster shared data upgrade failed       (status 0, Worker retval = 3) 
    10/09/2010 08:08:11 PM  Error         NODE1 19019   MSSQL$SQL2008                    Failover        N/A                                [sqsrvres] OnlineThread: SQL Cluster shared data upgrade failed       (status 0, Worker retval = 3) 
    10/09/2010 08:08:10 PM  Error         NODE1 19019   MSSQL$SQL2008                    Failover        N/A                                [sqsrvres] OnlineThread: SQL Cluster shared data upgrade failed       (status 0, Worker retval = 3) 
    10/09/2010 07:56:26 PM  Error         NODE1 19019   MSSQL$SQL2008                    Failover        N/A                                [sqsrvres] OnlineThread: SQL Cluster shared data upgrade failed       (status 0, Worker retval = 3) 
    10/09/2010 07:55:26 PM  Error         NODE1 19019   MSSQL$SQL2008                    Failover        N/A                                [sqsrvres] OnlineThread: SQL Cluster shared data upgrade failed       (status 0, Worker retval = 3)

    Cluster log:

    00000d78.000010dc::2010/09/14-16:50:34.060 ERR   [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] OnlineThread: SQL Cluster shared data upgrade failed                            (status 0, Worker retval = 3)
    00000d78.000010dc::2010/09/14-16:50:34.060 ERR   [RHS] Online for resource SQL Server (SQL2008) failed.
    000009dc.000011cc::2010/09/14-16:50:34.060 INFO  [RCM] HandleMonitorReply: ONLINERESOURCE for 'SQL Server (SQL2008)', gen(0) result 5018.
    000009dc.000011cc::2010/09/14-16:50:34.060 INFO  [RCM] TransitionToState(SQL Server (SQL2008)) OnlinePending-->ProcessingFailure.
    000009dc.00001168::2010/09/14-16:50:34.060 ERR   [RCM] rcm::RcmResource::HandleFailure: (SQL Server (SQL2008))
    000009dc.00001168::2010/09/14-16:50:34.060 INFO  [RCM] resource SQL Server (SQL2008): failure count: 1, restartAction: 2.

The error was preventing SQL Server from creating and writing any ERRORLOG file, so we could not rely on this log either. In this kind of situation Sysinternals Process Monitor (Procmon) is usually a good option, so we ran this utility while reproducing the service start failure. After capturing the trace, we filtered it by the rhs.exe process, the Resource Health Check process responsible for loading SQSRVRES.DLL and running the IsAlive check. Through this test we could see the DLL was detecting that the SQL Server component required an update, because the PatchLevel and SharedDataPatchLevel values in the Windows registry were different. The error was taking place during this update:

[Image: Procmon capture of rhs.exe comparing the PatchLevel and SharedDataPatchLevel registry values]

So Procmon was useful in telling us what was failing, but not why it was failing. Working with my colleague João Loureiro, I learned that by setting the following registry value to "1" (one), the Windows cluster.log file could provide more verbose output:

    HKLM\Cluster\Resources\<ID for SQL Server Instance>\Parameters\VerboseLogging
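As a side note, the same private property can also be set with cluster.exe instead of editing the registry directly; a sketch, using the resource name from this case:

cluster resource "SQL Server (SQL2008)" /priv VerboseLogging=1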

After modifying this key, we tried to bring the SQL Server resource online on the updated node again and checked the cluster.log file one more time; this time the information in the file was the following:

    000014c8.00001eb0::2010/10/06-16:34:54.991 INFO  [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] OnlineThread: enter; VirtualServerName=NETNAME; InstanceName=SQL2008
    000014c8.00001eb0::2010/10/06-16:34:54.991 INFO  [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] OnlineThread: ServiceName=MSSQL$SQL2008
    000014c8.00001eb0::2010/10/06-16:34:55.022 INFO  [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] ClusterResourceControl, found the network name: NETNAME.
    000014c8.00001eb0::2010/10/06-16:34:55.022 INFO  [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] OnlineThread: ServerName=NETNAME\SQL2008
    000014c8.00001eb0::2010/10/06-16:34:55.054 INFO  [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] Worker Thread (14FB7D0): Calling SQLClusterResourceWorker::WaitForCompletion (200)
    000014c8.000017f0::2010/10/06-16:34:55.054 INFO  [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] Worker Thread (14FB7D0): Worker thread starting ...
    000014c8.000017f0::2010/10/06-16:34:55.054 INFO  [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] Worker Thread (14FB7D0): Entering SQLClusterSharedDataUpgradeWorker thread.
000014c8.000017f0::2010/10/06-16:34:55.054 INFO  [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] Worker Thread (14FB7D0): SqlDataRoot = 'R:\MSSQL10.SQL2008\MSSQL',         CurrentPatchLevel = '10.1.2789.0',         SharedDataPatchLevel = '10.1.2531.0'
    000014c8.000017f0::2010/10/06-16:34:55.054 INFO  [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] Worker Thread (14FB7D0): FTDataPath = 'R:\MSSQL10.SQL2008\MSSQL\FTData',         SQLGroup = 'S-1-5-80-2586167408-2258694634-1203600018-1002233561-722790735'
000014c8.000017f0::2010/10/06-16:34:55.054 INFO  [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] Worker Thread (14FB7D0): Entering DoSQLDataRootApplyACL (R:\MSSQL10.SQL2008\MSSQL\Data).
    000014c8.000017f0::2010/10/06-16:34:55.054 WARN  [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] Worker Thread (14FB7D0): DoSQLDataRootApplyACL : Failed to create directory tree at SQLDataRoot.
    000014c8.000017f0::2010/10/06-16:34:55.069 INFO  [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] Worker Thread (14FB7D0): Exiting DoSQLDataRootApplyACL (3).
    000014c8.000017f0::2010/10/06-16:34:55.069 INFO  [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] Worker Thread (14FB7D0): Exiting SQLClusterSharedDataUpgradeWorker thread (3).
    000014c8.000017f0::2010/10/06-16:34:55.069 INFO  [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] Worker Thread (14FB7D0): Worker thread exiting (retval = 3) ...
    000014c8.00001eb0::2010/10/06-16:34:55.069 INFO  [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] Worker Thread (14FB7D0): Calling SQLClusterResourceWorker::WaitForCompletion (4294967295)
    000014c8.00001eb0::2010/10/06-16:34:55.069 ERR   [RES] SQL Server <SQL Server (SQL2008)>: [sqsrvres] OnlineThread: SQL Cluster shared data upgrade failed       (status 0, Worker retval = 3)
    000014c8.00001eb0::2010/10/06-16:34:55.069 ERR   [RHS] Online for resource SQL Server (SQL2008) failed.

Here we could see much more detailed information. After reviewing this log with the help of the customer, we found that the R: drive was indeed non-existent on the cluster; only then were we able to understand why the SQLClusterSharedDataUpgradeWorker thread was failing. Checking the following registry key:

    HKLM\Software\Microsoft\Microsoft SQL Server\MSSQL10.SQL2008\Setup\SQLDataRoot

we found drive R: listed instead of the actual SQL Server data root drive. From this point, fixing the problem was a matter of changing this registry key on one cluster node at a time, right after installing the required CU. It is clear that the problem was not related to CU9 specifically; using any other CU or SP would probably have led to a similar situation.
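For reference, a quick way to check the value this key holds from a command prompt (using the instance from this case):

reg query "HKLM\Software\Microsoft\Microsoft SQL Server\MSSQL10.SQL2008\Setup" /v SQLDataRoot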

Neither the customer nor we were able to understand how this path could have been changed. We can only imagine some ALTER DATABASE statement was run at a certain point, leaving a wrong registry entry for the SQLDataRoot key.

  • SQL Server Cast

    “The IP Address 'x.x.x.x' is already in use. To continue, specify a different IP address” SQL Server 2008 cluster setup error

    • 3 Comments

Today I want to talk about my experience with one of the strangest setup errors I have found installing SQL Server in a cluster. In this particular case I was helping a customer install a new SQL Server 2008 R2 instance on a three-node cluster. The Windows cluster was already running two instances, each one on a different cluster node, and the customer wanted to install this third instance on the third cluster node. The instances on the other cluster nodes had been running for a while with no issues.

The customer was trying to install the third instance, but the setup was consistently failing with the following error:

    SQL Server Setup has encountered the following error: The IP Address ‘10.246.18.118’ is already in use. To continue, specify a different IP address. Error code 0x84B40000.

[Image: SQL Server Setup error dialog reporting the IP address conflict]

The initial question, "is there any other machine with that IP address on the network?", was quickly answered: neither PING nor NSLOOKUP showed any other host owning that IP address. As usual with setup problems, I looked into the SQL Server setup log files. The "Summary.txt" file had the same error reported by the GUI:

    Exception type: Microsoft.SqlServer.Chainer.Infrastructure.InputSettingValidationException
        Message:
The IP Address '10.246.18.118' is already in use. To continue, specify a different IP address.

    The “Detail.txt” setup log file had more information. We were able to see that indeed the IP address 10.246.18.118 did not exist on the network during the initial setup phase:

    2010-11-12 14:53:09 Slp: IP Addresses have been specified so no defaults will be generated.
    2010-11-12 14:53:38 Slp: SendARP didn't return a MAC address for IP address '10.246.18.118'.  The message was 'The network name cannot be found.'.  This indicates the address is valid to create.

    […]

    2010-11-12 14:59:33 Slp: SendARP didn't return a MAC address for IP address '10.246.18.118'.  The message was 'The network name cannot be found.'.  This indicates the address is valid to create.

    … but all of a sudden, the ARP request succeeded in finding a valid host with that same IP address, causing the setup to halt:

    2010-11-12 15:00:25 Slp: SendARP for IP Address '10.246.18.118' succeeded.  The found MAC address is '00:00:5e:00:01:65'.  The IP address is already in use.  Pick another IP address to continue.
    2010-11-12 15:00:28 Slp: SendARP didn't return a MAC address for IP address '10.246.16.118'.  The message was 'The network name cannot be found.'.  This indicates the address is valid to create.
    2010-11-12 15:00:28 Slp: Hosting object: Microsoft.SqlServer.Configuration.ClusterConfiguration.ClusterIPAddressPrivateConfigObject failed validation
    2010-11-12 15:00:28 Slp: Validation for setting 'FAILOVERCLUSTERIPADDRESSES' failed. Error message: The IP Address '10.246.18.118' is already in use. To continue, specify a different IP address.
    2010-11-12 15:00:28 Slp: Error: Action "Microsoft.SqlServer.Configuration.SetupExtension.ValidateFeatureSettingsAction" threw an exception during execution.
    2010-11-12 15:00:28 Slp: Microsoft.SqlServer.Setup.Chainer.Workflow.ActionExecutionException: The IP Address '10.246.18.118' is already in use. To continue, specify a different IP address. ---> Microsoft.SqlServer.Chainer.Infrastructure.InputSettingValidationException: The IP Address '10.246.18.118' is already in use. To continue, specify a different IP address. ---> Microsoft.SqlServer.Chainer.Infrastructure.InputSettingValidationException: The IP Address '10.246.18.118' is already in use. To continue, specify a different IP address.
    2010-11-12 15:00:28 Slp:    --- End of inner exception stack trace ---

    […]

    2010-11-12 15:05:26 Slp: Error result: -2068578304
    2010-11-12 15:05:26 Slp: Result facility code: 1204
    2010-11-12 15:05:26 Slp: Result error code: 0

In SQL Server 2008, a cluster installation is divided into two main phases: the first one takes care of copying the instance files to the target machine and registering all the components, while the second one takes care of creating the clustered resources. In our case the setup was failing at the very end, where the cluster resources are created. As you can see, the "Detail.txt" file was also pointing to the MAC address of the offending host.

As you may know, ARP is a network protocol that takes care of resolving IP addresses to physical (MAC) addresses. This information is stored in memory in the ARP table. The ARP -a command checks the information stored by Windows in the ARP cache table. After uninstalling the components left behind by the failed setup one more time, we did a quick ARP check based on what the setup was doing and found the following information:

[Image: ARP -a output showing the offending entry]

The IP addresses 10.246.16.100 and 10.246.18.100 were the two virtual IP addresses that we were trying to configure on the cluster. I am far from being an expert in networking, but having an IP address from the x.x.18.x/24 network listed under the x.x.16.x/24 interface was strange enough to make me think of an address resolution problem. The IP address and MAC address were in fact the same ones that were causing the setup to fail. We tried PING and NSLOOKUP against the same IP address again, but nothing came back.
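For reference, these are the two commands involved in inspecting and, later on, clearing the Windows ARP cache (the address is the offending one from this case):

@echo Display the ARP cache table
arp -a
@echo Remove the offending dynamic entry
arp -d 10.246.18.118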

    I double-checked with one of the SQL Server 2008 R2 clusters in my lab and found that every network address was correctly shown under its corresponding network interface:

[Image: ARP -a output from the lab cluster, with each address under its corresponding interface]

Strangely enough, the "bogus" ARP entry was created as dynamic in the case of the customer's cluster, so we were expecting it to be removed from the ARP cache table after several seconds, but this was not happening. I was not sure what was announcing the IP address via ARP on the network; I can only think of a problem with the NIC teaming, but we ran out of time and were not able to test this hypothesis. After removing the offending ARP entry with ARP -d, we ran a new setup, which this time finished successfully. If you have had a similar experience with this error or have any idea where that ARP entry could have come from, please let me know.

  • SQL Server Cast

    SSIS error DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER when connecting to Oracle data source

    • 3 Comments

This time I want to discuss a support case that took me some days to get fixed. This case may be of interest to anyone having problems connecting a SQL Server installation to an Oracle data source using SQL Server Integration Services (SSIS).

In this case my customer was trying to connect to an Oracle server in order to copy several tables into SQL Server 2005. SQL Server Integration Services is a great tool for this purpose: you can configure how the copy job takes place using SQL Server Business Intelligence Development Studio (aka BIDS) and check from a user-friendly interface how each step is executed. BIDS is in fact a version of Visual Studio 2005 (the executable name is devenv.exe) with the components required to work with the different SQL Server services and components.

My customer was executing BIDS from her own computer without issues, but was finding a problem while working with BIDS on the server where SQL Server was installed. The errors she was getting were:

    [Connection manager "mib.bigcorp.com.gca"] Error: SSIS Error Code DTS_E_OLEDBERROR. 
    An OLE DB error has occurred. Error code: 0x80004005. An OLE DB record is available.
    Source: "Microsoft OLE DB Provider for Oracle"  Hresult: 0x80004005
    Description: "ORA-06413: Connection not open.".

    [OLE DB Source [1]] Error:
    SSIS Error Code DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER. 
    The AcquireConnection method call to the connection manager "mib.bigcorp.com.gca"
    failed with error code 0xC0202009. 
    There may be error messages posted before this with more information on why the
    AcquireConnection method call failed.

Since the customer was using the Microsoft OLE DB Provider for Oracle to set up the connection, I did some research in the available documentation on the Microsoft Support website using the terms "oracle", "oledb" and "provider". I found several documents discussing different problems, but one of them was of special interest here:

    INFO: Limitations of Microsoft Oracle ODBC Driver and OLEDB Provider (KB244661)

    SUMMARY: This article describes the limitations of the current version of Microsoft ODBC Driver and OLE DB Provider for Oracle. These components are currently in maintenance mode. No new updates are planned for future releases of these components including their support against versions later than Oracle 8i.

According to this article, it is not recommended to make use of the Microsoft OLE DB Provider for Oracle if the Oracle server is a version beyond 8i (my customer was using version 9i). This does not mean that the Microsoft OLE DB Provider does not work at all (indeed, I was able to preview the remote table on the Oracle server and test connectivity successfully from within SSIS using this provider), but the functionality offered is reduced and there was a good chance that some incompatibility problem would happen.

I asked the customer to install the Oracle client on the server (different versions of the Oracle client software are available here). This client provides the required OLE DB provider from Oracle that enables connectivity to an Oracle data source. Only some hours later the customer came back to me saying "I have installed the Oracle client, but the only Oracle OLE DB component I am able to select in SSIS is the Microsoft one". Ok, I knew that one from a different support incident I had in the past. BIDS runs as a 32-bit application even on a 64-bit server, as you can quickly check using Windows Task Manager:

[Image: Task Manager showing devenv.exe running as a 32-bit process]

This means that, unless you install the 32-bit Oracle client on the 64-bit server, you won't be able to access the Oracle OLE DB provider. This applies not only to the Oracle OLE DB provider, but to any other 32-bit provider. You can read a full explanation about why you need the 32-bit components on your 64-bit server here:

    The 32-bit SSIS Designer [BIDS] displays only 32-bit providers that are installed on the local computer. To configure a connection manager to use a specific 64-bit provider, you must install the 32-bit version of the provider on the development computer for use at design time. Even though the 32-bit version of the provider is installed, you can still run the package in 64-bit mode, both in the development environment and after deployment. The 32-bit and 64-bit versions of a provider have the same ID. Therefore, the SSIS runtime will select the appropriate version of the provider to use.

But having the 32-bit component is not enough; we need both the 32-bit and 64-bit components installed side by side. With only the 32-bit Oracle client the package will fail when executing from BIDS; this is because there is only one SSIS service on a server, and it is indeed a 64-bit process when running on a 64-bit server, as we can see in this Process Explorer screen capture (the process name is MsDtsSrvr.exe):

[Image: Process Explorer showing MsDtsSrvr.exe running as a 64-bit process]

Ok, we now had the SSIS package using the Oracle OLE DB provider from Oracle, and we had verified that both the 32-bit and 64-bit versions of the Oracle client were installed on the Windows server.

    However, the package kept failing with the same error.

After several hours tracing down the error with the help of Sysinternals Process Explorer, Process Monitor and the Internet, we did find a possible cause for this error. Apparently some Oracle client versions are affected by a bug (Oracle bug number 3807408) that causes the client networking layer to incorrectly parse program locations that contain parentheses in the path. You can find a brief discussion of this behavior on this thread in the Oracle forums.

As you probably know, on a Windows Server x64 installation the default program files directory is "C:\Program Files\", but all the 32-bit components are installed in the "C:\Program Files (x86)\" directory. Since my customer did not have access to Oracle support at that time, we decided to test whether this was the root cause of our problem by copying the contents of the "C:\Program Files (x86)\Microsoft Visual Studio 8" folder under a new program files folder called "C:\ProgramFilesx86". We started BIDS (devenv.exe) from the new directory and executed the package both from BIDS and SQL Server, this time successfully.
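In command form, the workaround was essentially the following sketch (the devenv.exe location assumes a default Visual Studio 2005 layout):

@echo Copy BIDS to a path without parentheses and start it from there
xcopy "C:\Program Files (x86)\Microsoft Visual Studio 8" "C:\ProgramFilesx86\Microsoft Visual Studio 8" /E /I
start "" "C:\ProgramFilesx86\Microsoft Visual Studio 8\Common7\IDE\devenv.exe"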

The trick finally did the job and my customer was able to successfully import the required tables into SQL Server; however, I do not think this is a good (much less an "elegant") solution, and I recommended that she contact Oracle support for a version of the Oracle client not affected by this bug.

  • SQL Server Cast

    CA eTrust Antivirus reports infection by JS/Snz.A on sqlcast blog!

• 1 Comment

Update (January 1st): CA has released a new eTrust definitions update that corrects the error (build 5421); apparently this was a false positive. Thanks to a reader for reporting this. No word from CA so far (issue number 16683491 01).

    I have discovered the following virus detection message appearing on Windows computers using CA eTrust antivirus software:

[Image: eTrust virus detection pop-up reporting JS/Snz.A]

    This problem seems to appear using the latest eTrust antivirus update (version 31.3, build 5417).

I have made a brand-new installation of my PC with Windows Vista and eTrust only, updating both the OS and the antivirus software with the latest updates. I have found that the virus detection pop-up appears whenever I browse to this blog's web address.

A quick search on the Web turns up reports of false positives raised by CA eTrust antivirus software on some JavaScript code, but I have not been able to find a trusted source of information about this so far. CA does not currently report any information about this virus detection on its support website.

I have run two different online antivirus scanners, Windows Live OneCare and Trend Micro HouseCall, and neither has reported an infection.

    I have contacted CA Technical Support requesting more information about this issue. I will update the blog with a new post as soon as I have more information.

  • SQL Server Cast

    Understanding the value of the Enterprise Edition with simple examples

• 1 Comment

This time I just want to announce that my colleague Ignacio Alonso Portillo (aka "Nacho") and I have plans to start posting some simple examples to demonstrate, from a practical standpoint, the huge benefits provided by the features which are only available in the Enterprise Edition of SQL Server 2005.

Recently, we have been hearing more and more SQL Server users say: "Why would I want to choose the Enterprise Edition in SQL Server 2005? It sounds to me that in previous versions of the product, the only benefit I got from the Enterprise Edition, compared to the Standard, was that it was cluster aware; now that the Standard Edition of SQL Server 2005 also supports failover clustering, I don't need to "waste" any extra money on buying an Enterprise Edition license."

    To be honest, that assertion knocks me out. :-P

First of all, it isn't true that failover clustering support was the only difference between the Standard and Enterprise editions in previous versions. There was much more than that. But we won't touch the SQL Server 2000 topic nowadays, over seven years after it was released; that wouldn't be of too much value to you. Instead, we prefer to focus on SQL Server 2005 initially, and will possibly extend the series to cover SQL Server 2008 as well.

In 2005 there is so much extra value in the Enterprise Edition that we prefer to believe that users who agree with the statement above simply have never had it explained to them what all that value is, what form it takes within the product, and how they can actually benefit from it.

For that reason, we feel it is our obligation to show you all these benefits, using simple and clear examples, with the aim of helping you make better decisions when it comes to choosing the most appropriate edition of SQL Server to satisfy your business requirements.

So, stay tuned and don't miss any of those deliveries if you want to make the best decisions. ;-)

  • SQL Server Cast

    Reducing tempdb size in a SQL Server Cluster (after ALTER DATABASE execution)

• 1 Comment

Some weeks ago a colleague from another company called me for help in fixing a database performance problem. The root cause was a possible contention issue on tempdb that was causing the database to perform badly from time to time. He found a match between the times of bad performance and the following messages in the SQL Server ERRORLOG file:

    2007-12-03 09:34:31.07 spid1     SQL Server has encountered 2 occurrence(s) of IO requests taking longer than 15 seconds to complete on file [C:\Program Files\Microsoft SQL Server\MSSQL$YUKON\Data\tempdb.mdf] in database [tempdb] (7).  The OS file handle is 0x00000530.  The offset of the latest long IO is: 0x0000008ef65a00

    2007-12-03 09:34:57.04 spid104   SQL Server has encountered 8186 occurrence(s) of IO requests taking longer than 15 seconds to complete on file [C:\Program Files\Microsoft SQL Server\MSSQL$YUKON\Data\tempdb.mdf] in database [tempdb] (2).  The OS file handle is 0x000005FC.  The offset of the latest long IO is: 0x0000004d242000

    [...]

They had already moved the tempdb database to a different system partition, as recommended in situations where contention problems exist. They had also changed the tempdb size from 250 MB to 2,500 MB, which was the maximum size observed for this database during peak usage. Although the overall database performance was better now, it was not brilliant, and the "IO requests taking longer than 15 seconds" messages continued popping up in the ERRORLOG. They now wanted to try splitting tempdb into several files, which is also a recommended procedure for tempdb contention problems.

Unfortunately, the partition where tempdb had been moved was only 10 GB in size, and he needed to split tempdb into 8 different files (8 was the number of logical processors on the server). Creating 8 tempdb files of 2,500 MB each was not only impossible in this case but hard to justify from a logical point of view.
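For reference, splitting tempdb is done by adding data files to it; a minimal sketch with a hypothetical file name, path and size:

ALTER DATABASE tempdb
ADD FILE ( NAME = tempdev2, FILENAME = 'E:\SQLData\tempdb2.ndf', SIZE = 300MB );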

As you probably know, a database cannot be reduced to a size that is smaller than its current size using a simple ALTER DATABASE ... MODIFY FILE instruction. If you try to change the tempdb system database size to a smaller value from a Query window, you will see the following error message:

    Msg 5039, Level 16, State 1, Line 1
    MODIFY FILE failed. Specified size is less than current size.

Surprisingly, when reducing the database size from SQL Server Management Studio, the GUI does not report any error, but the database size is not changed. This seems like a step back from SQL Server 2000, where the same action in Enterprise Manager caused the following error to be raised:

Error 21335: [SQL-DMO]The new DBFile size must be larger than the current size.

(UPDATE: An in-depth explanation of this behavior can be found on this MSDN blog.)

This error message is expected: for data and log files, the new size should be larger than the current size. If we want to reduce the database size, we have to shrink it and return the freed-up space to the operating system (provided the database files can be shrunk). My first thought was to point my colleague to Knowledge Base article KB307487, where the shrinking process for tempdb is explained, but while reading the article I discovered the following paragraph:

    "There are limitations for use of the DBCC SHRINKDATABASE command on the tempdb database. The target size for data and log files cannot be smaller than the size specified when the database was created or the last size explicitly set with a file-size changing operation such as ALTER DATABASE with the MODIFY FILE option or the DBCC SHRINKFILE command."

They indeed had executed an ALTER DATABASE command with the MODIFY FILE option when they changed the original tempdb database size, so shrinking tempdb was not an option in this case. The only option was to execute a new ALTER DATABASE command with a smaller size than the current one, as specified in the "Method 1" section of the KB.

This requires the SQL Server service to be started in minimal configuration mode with the -c and -f startup options, but how do we do this in a cluster? For standalone servers we have to execute the following instruction:

    sqlservr -c -f [ -s%InstanceName% ]

where -s is a required parameter for a named instance; if we are using a default instance (MSSQLSERVER), the -s switch is not needed. If we have installed a SQL Server named instance, the instruction to use on a standalone server would be:

    sqlservr -c -f -sMSSQL$YUKON

where "YUKON" corresponds to the instance name. On a cluster installation, the instance name is composed using the "SQL Server Network Name" cluster resource. In my example, Cluster Administrator shows the SQL Server resources as follows:

    so the instruction to startup the SQL Server service would be:

    sqlservr -c -f -sSQL-2005\YUKON

This instruction should be executed from a command prompt window (cmd.exe); the SQL Server service will start, logging the ERRORLOG information to the console window. Of course, we can also add the -f startup option to the SQL Server service in SQL Server Configuration Manager (2005) or Enterprise Manager (2000) and restart; in this case we do not need to use -c. In SQL Server Configuration Manager we have to go to the SQL Server properties, Advanced tab, Startup Parameters, while in SQL Server Enterprise Manager we go to the instance properties, General tab, Startup Parameters button.

Note: From a cmd.exe window, use the full path to sqlservr.exe, adding -c and -f (with a hyphen). If you use the NET START command, use the SQL Server service name with /c /f (with a slash).
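For example, using the named instance from this post:

net start MSSQL$YUKON /c /f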

Once the SQL Server service is started, we will be able to change the tempdb database size. You can use the sqlcmd command-line utility (or osql if you are running SQL Server 2000), opening a new cmd.exe window, to perform this change. The following example will configure the initial tempdb database size to 70 MB:

    C:\>sqlcmd -S SQL-2005\YUKON
    1> USE master
    2> GO
    Changed database context to 'master'.
    1> ALTER DATABASE tempdb
    2> MODIFY FILE ( NAME = tempdev, SIZE = 70MB )
    3> GO
    1> exit

Alternatively, you can execute this same ALTER DATABASE instruction from a Query window in SQL Server Management Studio or Enterprise Manager. If you have used the GUI to change the startup options, do not forget to remove the -f startup parameter from the SQL Server service options in Configuration Manager or Enterprise Manager once the change is done.
