Benjamin Wright-Jones

SQL Server Lessons Learned and Notes from the Field (Microsoft Consultancy Services, UK)

News

  • This posting is provided "AS IS" with no warranties, and confers no rights. Use of included script samples is subject to the terms specified on Microsoft.com.
New editions of SQL Server 2008 R2 announced
  • SQL Server 2008 R2 Datacenter

  • SQL Server 2008 R2 Parallel Data Warehouse

  • http://www.microsoft.com/sqlserver/2008/en/us/R2-editions.aspx

    Processing Excel files in SSIS 2008 – Lessons Learned

    I have recently been working on a project where we needed to load and transform data held in Excel 2007 into a SQL Server database.  The Excel sheets were fairly complex and had different structures.  Fortunately, SSIS 2008 provides some good tools to handle this situation.

    Lessons learned

    1. It is possible to read the Excel metadata using the mechanism described in this knowledge base article: http://support.microsoft.com/kb/318452 (HOW TO: Retrieve Meta Data from Excel by Using GetOleDbSchemaTable in Visual C# .NET).  You may be wondering why I would want to do this.  Handling Excel sheets with different structures can be tricky, so reading the metadata can help determine control flow processing, i.e. which data flow to use to process the file.

    2. Remember, if you are testing on x64 then your package will not execute if it uses the Excel source, since the Excel source is not supported in the 64-bit runtime.  To work around this, set the Run64BitRuntime property to False on the Debugging page of the project properties.

    3. The Script component is really great when it comes to data manipulation in the data flow.  It can be used as a source, transformation or destination and allows you to manipulate the rows in the pipeline using either VB.NET or C#.

    4. Excel files can also be read using the Execute SQL task in the control flow, which is a nice feature, e.g. SELECT * FROM [PivotData$]

    5. The File System Task can make file handling a lot easier.  Combined with variables and property expressions, dynamic file manipulation becomes a whole lot easier.  For example, after processing I move the file to a different folder depending on the outcome, e.g. success or failure.
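
    As a rough illustration of lesson 1, the sketch below lists the sheet names in a workbook with GetOleDbSchemaTable and picks a data flow from them.  It assumes the ACE OLE DB 12.0 provider is installed (needed for Excel 2007 files); the class name, the routing rule and the data flow names are hypothetical and only there to show the idea.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.OleDb;

public static class ExcelMetadata
{
    // Connection string for an Excel 2007 workbook.
    // Assumption: the Microsoft.ACE.OLEDB.12.0 provider is installed.
    public static string BuildConnectionString(string workbookPath)
    {
        return string.Format(
            "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};" +
            "Extended Properties=\"Excel 12.0 Xml;HDR=YES\";",
            workbookPath);
    }

    // Returns the TABLE_NAME values from the schema rowset.
    // Worksheets are reported with a trailing '$', e.g. "PivotData$".
    public static List<string> GetSheetNames(string workbookPath)
    {
        var names = new List<string>();
        using (var connection = new OleDbConnection(BuildConnectionString(workbookPath)))
        {
            connection.Open();
            DataTable tables = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
            foreach (DataRow row in tables.Rows)
            {
                names.Add(row["TABLE_NAME"].ToString());
            }
        }
        return names;
    }

    // Hypothetical routing rule: workbooks containing a PivotData sheet go to
    // one data flow, everything else to a default one.
    public static string ChooseDataFlow(IEnumerable<string> sheetNames)
    {
        foreach (string name in sheetNames)
        {
            // Names containing spaces come back wrapped in single quotes.
            if (name.Trim('\'').Equals("PivotData$", StringComparison.OrdinalIgnoreCase))
            {
                return "PivotDataFlow";
            }
        }
        return "DefaultDataFlow";
    }
}
```

    In a package, the value returned by ChooseDataFlow could be written to a variable and used in precedence constraint expressions to select the data flow.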

    Changing the LISTENER_IP address in a database mirroring configuration

    This question came up today: how can I change the IP address between the principal and mirror in database mirroring?

    This is not a common operation, but the procedure below worked in an isolated lab environment where we had full control over the application and transaction activity.  We wanted to introduce a WAN latency injector, so we needed to change the database mirroring IP addresses on the principal and mirror.

    1. Stop application activity
    2. Remove mirroring (ALTER DATABASE <dbname> SET PARTNER OFF)
    3. Stop the mirroring endpoints (on principal and mirror)
    4. Alter the mirroring endpoints to use the new IP addresses e.g. ALTER ENDPOINT <endpoint> AS TCP (LISTENER_PORT = <port>, LISTENER_IP = (x.x.x.x))
    5. Start the endpoints on principal and mirror
    6. Re-enable mirroring (ALTER DATABASE <dbname> SET PARTNER = 'TCP://x.x.x.x:<port>'), on the mirror first and then on the principal
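
    These are not the scripts we used; as a stand-in, here is a minimal sketch that generates the T-SQL for steps 3 to 6 so the statements can be reviewed before being run.  The class and method names, and the sample endpoint name, database name, port and addresses in the usage note, are all hypothetical, and the generated syntax should be checked against Books Online for your build.

```csharp
using System;
using System.Collections.Generic;

public static class MirroringEndpointScript
{
    // Steps 3-5 for ONE server: stop the endpoint, rebind it, start it again.
    // Generate and run this once on the principal and once on the mirror.
    public static IList<string> BuildEndpointChange(string endpointName, string newListenerIp, int port)
    {
        return new List<string>
        {
            string.Format("ALTER ENDPOINT [{0}] STATE = STOPPED;", endpointName),
            string.Format(
                "ALTER ENDPOINT [{0}] AS TCP (LISTENER_PORT = {1}, LISTENER_IP = ({2}));",
                endpointName, port, newListenerIp),
            string.Format("ALTER ENDPOINT [{0}] STATE = STARTED;", endpointName)
        };
    }

    // Step 6: point a database at its partner's new address.
    // Run on the mirror first (partner = principal), then on the principal.
    public static string BuildSetPartner(string databaseName, string partnerIp, int port)
    {
        return string.Format(
            "ALTER DATABASE [{0}] SET PARTNER = 'TCP://{1}:{2}';",
            databaseName, partnerIp, port);
    }
}
```

    For example, BuildEndpointChange("Mirroring", "10.0.0.1", 5022) followed by BuildSetPartner("Sales", "10.0.0.2", 5022) produces the statements for a principal listening on 10.0.0.1 whose mirror is at 10.0.0.2.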

    I’ll try and find the exact scripts we used and upload them here.

    New HP Resources on SQL Server 2008 Data Warehousing / Business Intelligence

    I haven’t had a chance to look through these yet so I can’t comment on the content but I thought I would post here to share these new resources.

    HP Business Intelligence Sizer for Microsoft SQL Server 2005/2008

    http://h71019.www7.hp.com/ActiveAnswers/us/en/sizers/microsoft-sql-bi.html

    HP Whitepapers on SQL Server 2008 Data Warehousing / Business Intelligence

    http://h20195.www2.hp.com/V2/GetPDF.aspx/4AA2-5263ENW.pdf

    http://h20195.www2.hp.com/V2/getdocument.aspx?docname=4AA2-8173ENW.pdf

    http://h20195.www2.hp.com/V2/GetPDF.aspx/4AA2-7162ENW.pdf

    Troubleshooting SQL Server Transactional Replication

    I often see questions about transactional replication performance problems, especially around latency/delays between the publisher and subscriber(s), so I’ve put a few pointers below on what to investigate.  Latency between the publisher, distributor and subscriber(s) is, more often than not, a symptom of other causes, for example poor I/O capacity on subscribers, blocking/locking, hotspots on indexes, a high number of virtual log files, etc.

    Troubleshooting tips:

    • Look at perfmon counters (Disk Reads/sec, Disk Writes/sec, Avg. Disk sec/Read and Avg. Disk sec/Write) to ensure there is enough capacity and that the latency on the data and log drives is within our recommended boundaries.
    • Look at wait stats (use DMVstats, which is highly recommended) to see which resources are being waited on. This will give you a good indication of where the bottleneck is.
    • Look at the transactional replication performance monitor counters (pending Xacts, transactions/sec, latency etc.)
    • Check the number of VLFs (http://support.microsoft.com/kb/949523), as a very high number can have a negative impact on log scanning. I tend to ensure this value is below 1000.
    • Use tracer tokens to check latency from publisher to distributor to subscriber
    • Use agent logging to external files (-OutputVerboseLevel 2 -Output <dir\file>) to troubleshoot data issues
    • Look in MSlogreader_history, MSdistribution_history and MSrepl_errors in the distribution database
    • Consider external factors, e.g. consult the network/SAN specialists to check for issues such as network bandwidth or array performance.
    • If blocking is suspected then use the blocked process trace definition.  I can highly recommend this as it provides incredibly valuable information about the blocked and blocking processes.
    • If you are using database mirroring in conjunction with transactional replication then the log reader may have to wait for the record to be hardened on the mirror.  This can be avoided by using trace flag 1448 on the publisher.
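
    The VLF check in the list above can be automated.  The sketch below assumes the undocumented but widely used DBCC LOGINFO command, which returns one row per VLF for the database named in the connection string; the class and method names are hypothetical, and the threshold simply encodes my rule of thumb of staying below 1000.

```csharp
using System;
using System.Data.SqlClient;

public static class VlfCheck
{
    // Rule of thumb from the tips above: keep the VLF count below 1000.
    public const int MaxRecommendedVlfs = 1000;

    public static bool IsVlfCountHigh(int vlfCount)
    {
        return vlfCount >= MaxRecommendedVlfs;
    }

    // Counts the rows returned by DBCC LOGINFO (one row per VLF).
    // The database checked is the one the connection string points at.
    public static int CountVlfs(string connectionString)
    {
        int count = 0;
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("DBCC LOGINFO", connection))
        {
            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    count++;
                }
            }
        }
        return count;
    }
}
```

    Running CountVlfs against each published database and flagging those where IsVlfCountHigh returns true gives a quick list of transaction logs worth rebuilding.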

    Optimisation tips:

    • Use agent profiles to optimise for workloads
    • Implement Read Committed Snapshot Isolation (RCSI) on subscribers to alleviate reader/writer blocking (when doing this consider the impact on tempdb as this is where the version store is located)
    • Ensure the distribution history clean-up job is correctly trimming the distribution database tables.
    • If there are data consistency issues, consider using tablediff to compare data in publisher/subscriber tables. Warning: this may take a while with large volumes of data, although tablediff can be run against a subset of the data by using views.
    • Be careful about using -SkipErrors to bypass consistency errors http://support.microsoft.com/kb/327817
    • Consider using -SubscriptionStreams on the distribution agent to use multiple threads to apply the data to the subscribers; read this http://support.microsoft.com/kb/956600 and this http://support.microsoft.com/kb/953199
    • If initialising from a backup/copy of the database, don’t enforce integrity on the subscribers.  Drop the constraints or use the NOT FOR REPLICATION option.

    Fusion IO 640GB SSD PCI-Express Cards

    We recently had the opportunity to test a couple of the Fusion IO PCI-Express 640GB SSD cards http://www.fusionio.com/Products.aspx in a Dell R900 server.  Unfortunately, time was against us and we were unable to do this.  The Fusion IO SSD cards would dramatically increase the IOPS capacity and, personally, I think they would be well suited to storing tempdb.  I’m a bit cautious about using SSD for data and transaction log files, so tempdb seems like the best fit.

    SQL Server 2008 Books Online (July 2009) Update

    I only just noticed that a new revision of the SQL Server 2008 Books Online documentation has been published; the download is here http://www.microsoft.com/downloads/details.aspx?FamilyID=765433f7-0983-4d7a-b628-0a98145bcb97&DisplayLang=en

    Using an SSIS package to monitor and archive the default trace file

    I find it frustrating that the SQL Server 2005/2008 default trace is continually overwritten and there is no way to store x number of files or x MBs of data.  As a workaround, I developed an SSIS package to monitor the \LOG folder and automatically archive the default trace file whenever a new file is created.

    This consists of a FOR LOOP container, a Script Task and a File System Task plus a whole bunch of variables and property expressions.

    The guts of the package are really in the Script Task, as this is where I use a WMI query to monitor the \LOG folder for .trc files.  The file is then renamed (date-time-servername-file) and moved to another folder or share, which can be a UNC path e.g. \\server\share.  This way I have a permanent record of the basic server activity for root cause analysis/troubleshooting.
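
    The sketch below illustrates only the rename-and-archive half of that idea; it is not the package code.  It swaps the WMI query for a simple scan of the folder, and assumes the default trace files follow the log_N.trc naming and that the newest file is still held open by SQL Server.  The class and method names, and the exact date-time format, are hypothetical.

```csharp
using System;
using System.IO;
using System.Linq;

public static class DefaultTraceArchiver
{
    // Builds the archive name in the date-time-servername-file pattern.
    public static string BuildArchiveName(DateTime stamp, string serverName, string traceFileName)
    {
        return string.Format("{0:yyyyMMdd}-{0:HHmmss}-{1}-{2}", stamp, serverName, traceFileName);
    }

    // Copies every default trace file except the newest one to the archive
    // folder (which can be a UNC path) and returns the number of files copied.
    public static int ArchiveClosedTraceFiles(string logFolder, string archiveFolder, string serverName)
    {
        // Newest first; the first entry is assumed to be the active trace, so skip it.
        var closedFiles = new DirectoryInfo(logFolder)
            .GetFiles("log_*.trc")
            .OrderByDescending(f => f.CreationTimeUtc)
            .Skip(1);

        int copied = 0;
        foreach (FileInfo file in closedFiles)
        {
            string target = Path.Combine(
                archiveFolder,
                BuildArchiveName(file.CreationTime, serverName, file.Name));

            // Skip files that were archived on an earlier pass.
            if (!File.Exists(target))
            {
                file.CopyTo(target);
                copied++;
            }
        }
        return copied;
    }
}
```

    Calling ArchiveClosedTraceFiles on a schedule, or from inside a loop like the FOR LOOP container described above, keeps the archive folder up to date without touching the file SQL Server is currently writing to.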

    [Screenshot: the basic structure of the package]

    SQL Server 2008 R2 Master Data Services

    In case you missed the Tech-Ed 2009 announcement, you can find the info here http://www.microsoft.com/sqlserver/2008/en/us/MDS.aspx

    “On initial scoping, it was determined that 'Bulldog' would ship as part of Microsoft Office SharePoint in the O14 wave.  At TechEd 2009, we announced a change in packaging for the new MDM capabilities. Project 'Bulldog' will now ship as part of the next release of SQL Server codenamed ‘Kilimanjaro’ as 'SQL Server Master Data Services'.

    This means that in addition to new capabilities such as Self Service BI and multi-server management, SQL Server ‘Kilimanjaro’ will also provide customers with a rich platform for MDM through SQL Server Master Data Services. Customers who have purchased Software Assurance (SA) should view this as net new value and innovation that they will have access to as a result of their investments in SA.”

    Initializing a Transactional Replication Subscriber from an Array-Based Snapshot

    After a few late nights, some coffee and a few review cycles, a new article has just been published on the SQLCAT site.  It provides an overview of the subscriber initialisation techniques for transactional replication and, more specifically, of using an array-based snapshot: http://sqlcat.com/technicalnotes/archive/2009/05/04/initializing-a-transactional-replication-subscriber-from-an-array-based-snapshot.aspx

    Technet Webcast: An Early look at SQL Server ‘Kilimanjaro’ and project ‘Madison’

    http://msevents.microsoft.com/CUI/WebCastEventDetails.aspx?EventID=1032413070&EventCategory=4&culture=en-US&CountryCode=US

    SQL Server 2005 Service Pack 3 Cumulative Update 3

    Just released http://support.microsoft.com/kb/967909/

    Using a C# script task in SSIS to download a file over http

    I recently had a requirement to automate downloading a file from a website and then perform ETL on the data in the file.  Fortunately, this is possible via the script task in SSIS (note that this is using SQL Server 2008 Integration Services).  I found a couple of web references to do this in VB.NET but I prefer C# so modified the code and made some adjustments to suit my (debugging) needs.  I set two package variables, vPipeline and vLocalFileName, to store the URL (source) and filename (destination); the script reads them into the local variables RemoteURI and LocalFileName.

    This works really well and I can change the variables at run-time using property expressions.

    // Requires these directives at the top of the script file:
    //   using System;
    //   using System.Net;             // WebClient
    //   using System.Windows.Forms;   // MessageBox (debugging only)

    public void Main()
    {
        string RemoteURI;
        string LocalFileName;
        bool FireAgain = true;

        Dts.Log("entering download..", 999, null);

        try
        {
            // Both variables must be listed in the task's ReadOnlyVariables
            RemoteURI = Dts.Variables["User::vPipeline"].Value.ToString();
            LocalFileName = Dts.Variables["User::vLocalFileName"].Value.ToString();

            // Debugging aids only: MessageBox blocks until dismissed,
            // so remove these calls before running the package unattended
            Console.WriteLine(RemoteURI);
            Console.WriteLine(LocalFileName);

            MessageBox.Show(RemoteURI);
            MessageBox.Show(LocalFileName);

            // Notification
            Dts.Events.FireInformation(0, String.Empty, String.Format("Downloading '{0}' from '{1}'", LocalFileName, RemoteURI), String.Empty, 0, ref FireAgain);

            // Download the file; the using block disposes the WebClient afterwards
            using (WebClient myWebClient = new WebClient())
            {
                myWebClient.DownloadFile(RemoteURI, LocalFileName);
            }

            Dts.TaskResult = (int)ScriptResults.Success;
        }
        catch (Exception ex)
        {
            // Report the error to SSIS and fail the task
            Dts.Events.FireError(0, String.Empty, ex.Message, String.Empty, 0);
            Dts.TaskResult = (int)ScriptResults.Failure;
        }
    }

    Microspotting?

    [Non-SQL related] I’ve just discovered an interesting site called Microspotting, which came to my attention courtesy of the grapevine. At first I wasn’t sure what it was about, but after a little digging it appears that the site is dedicated to sharing stories about Microsoft FTEs. As the site proclaims, it is a bit like having an internal paparazzi, but it provides some good insight.  Take a look here http://www.microspotting.com/

    SQL Server 2008 CU4 released

    Just published here http://support.microsoft.com/kb/963036
