Benjamin Wright-Jones

Exploring Information, Infrastructure and Solution Architecture

  • Benjamin Wright-Jones

    Using a C# script task in SSIS to download a file over http


    I recently had a requirement to automate downloading a file from a website and then perform ETL on the data in the file.  Fortunately, this is possible via the script task in SSIS (note that this is using SQL Server 2008 Integration Services).  I found a couple of web references for doing this in VB.NET but I prefer C#, so I modified the code and made some adjustments to suit my (debugging) needs.  I set two package variables, RemoteURI and LocalFileName, to store the URL (source) and filename (destination).

    This works really well and I can change the variables at run-time using property expressions.

    public void Main()
    {
        WebClient myWebClient;
        string RemoteURI;
        string LocalFileName;
        bool FireAgain = true;

        Dts.Log("entering download..", 999, null);

        try
        {
            myWebClient = new WebClient();

            RemoteURI = Dts.Variables["User::vPipeline"].Value.ToString();
            LocalFileName = Dts.Variables["User::vLocalFileName"].Value.ToString();

            // Notification
            Dts.Events.FireInformation(0, String.Empty, String.Format("Downloading '{0}' from '{1}'", LocalFileName, RemoteURI), String.Empty, 0, ref FireAgain);

            // Download the file
            myWebClient.DownloadFile(RemoteURI, LocalFileName);

            Dts.TaskResult = (int)ScriptResults.Success;
        }
        catch (Exception ex)
        {
            // Catch and handle error
            Dts.Events.FireError(0, String.Empty, ex.Message, String.Empty, 0);
            Dts.TaskResult = (int)ScriptResults.Failure;
        }
    }


  • Benjamin Wright-Jones

    Troubleshooting SQL Server Transactional Replication


    I often see questions about transactional replication performance problems, especially around latency/delays between the publisher and subscriber(s), so I’ve put a few pointers below on what to investigate.  Latency between the publisher, distributor and subscriber(s) is, more often than not, the symptom of other causes, for example poor I/O capacity on subscribers, blocking/locking, hotspots on indexes, or a high number of virtual log files.

    Troubleshooting tips:

    • Look at perfmon counters (disk reads and writes/sec, avg disk sec/read and avg disk sec/write) to ensure there is enough capacity and that the latency on the data and log drives is within the recommended boundaries. 
    • Look at wait stats (use DMVstats – highly recommended) to see which resources are being waited on. This will give you a good indication of where the bottleneck is. 
    • Look at the transactional replication performance monitor counters (pending Xacts, transactions/sec, latency etc.)
    • Check the number of VLFs, as a very high number can have a negative impact on log scanning; I tend to ensure this value is below 1000.
    • Use tracer tokens to check latency from publisher to distributor to subscriber.
    • Use agent logging to external files, -OutputVerboseLevel 2 -Output <dir\file>, to troubleshoot data issues.
    • Look in MSlogreader_history, MSdistribution_history and MSrepl_errors in the distribution database.
    • Consider external factors, e.g. consult the network/SAN specialists to check issues such as network bandwidth or array performance.
    • If blocking is suspected then use the blocked process trace definition.  I can highly recommend this as it provides incredibly valuable information about the blocked and blocking processes. 
    • If you are using database mirroring in conjunction with transactional replication then the log reader may have to wait for the record to be hardened on the mirror.  This can be avoided by using trace flag 1448 on the publisher.
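    The tracer token step above can be sketched in T-SQL (run at the publisher, in the publication database; the publication name is a placeholder):

```sql
-- Post a tracer token and note its id
DECLARE @tokenID int;
EXEC sys.sp_posttracertoken
    @publication = N'MyPublication',
    @tracer_token_id = @tokenID OUTPUT;

-- Later, view latency history for that token
-- (publisher-to-distributor and distributor-to-subscriber timings)
EXEC sys.sp_helptracertokenhistory
    @publication = N'MyPublication',
    @tracer_id = @tokenID;
```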

    Optimisation tips:

    • Use agent profiles to optimise for workloads
    • Implement Read Committed Snapshot Isolation (RCSI) on subscribers to alleviate reader/writer blocking (when doing this consider the impact on tempdb as this is where the version store is located)
    • Ensure the distribution history clean-up job is correctly trimming the distribution database tables.
    • If there are data consistency issues, consider using tablediff to compare data in publisher/subscriber tables (warning: this may take a while with large volumes of data); however, tablediff can be run against a subset of the data using views.
    • Be careful about using -SkipErrors to bypass consistency errors.
    • Consider using -SubscriptionStreams on the distribution agent to use multiple threads to apply the data to the subscribers.
    • If initialising from a backup/copy of the database, don’t enforce integrity on the subscribers.  Drop the constraints or use the NOT FOR REPLICATION option. 
  • Benjamin Wright-Jones

    How can I generate a T-SQL script for just the indexes?


    I was trying to work out the best way to generate a T-SQL script for only the indexes and I thought that this would be really easy in the Management Studio interface.  I was mistaken.  SQL Server Management Studio also generates scripts for the tables when you want the indexes.  This is not great, so I looked at other methods: 1) use T-SQL to generate a script from the metadata, or 2) use SMO (SQL Server Management Objects).  Although I like doing things with T-SQL, I thought I’d give SMO a try and below is the result.  I just hope this is made easier in future releases.

    using System;
    using System.Collections.Generic;
    using Microsoft.SqlServer.Management.Common;
    using Microsoft.SqlServer.Management.Smo;
    using System.Text;
    using System.Collections.Specialized;

    namespace SQLRMOSMO
    {
        class SMOTest   // Scripts all indexes for a specified database
        {
            static void Main(string[] args)
            {
                string servername;
                string databasename;
                servername = "<server\\instance>";
                databasename = "<database>";

                Server server = new Server(servername);
                Database db = server.Databases[databasename];
                ScriptingOptions so = new ScriptingOptions();
                so.ScriptDrops = false;
                so.Indexes = true;
                so.IncludeIfNotExists = false;
                so.DriForeignKeys = false;
                so.FileName = "c:\\indexes.sql";
                so.AppendToFile = true;
                foreach (Table t in db.Tables)
                    foreach (Index i in t.Indexes)
                        i.Script(so);   // write each index definition to the output file
            }
        }
    }


  • Benjamin Wright-Jones

    Using SSIS with SQL Azure databases


    I’m quite surprised how easy it is to set up an SSIS package to read and write data to a SQL Azure database.  A simple data flow task with an ADO.NET Destination is easy to set up; just make sure the SQL Azure table has a clustered index, otherwise writing data will fail (this is a prerequisite). 
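    As a minimal sketch (table and column names are illustrative), a destination table with the required clustered index might look like:

```sql
-- SQL Azure will not accept inserts into a heap; every table needs a clustered index
CREATE TABLE dbo.StagingCustomer
(
    CustomerID   int           NOT NULL,
    CustomerName nvarchar(100) NULL,
    CONSTRAINT PK_StagingCustomer PRIMARY KEY CLUSTERED (CustomerID)
);
```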


    The SSIS team have added a BULK INSERT option on the ADO.NET Destination in SQL Server 2008 R2 to improve performance; this blog entry covers the details:

    On the topic of indexes, it looks like there is a bug in the scripting engine in SQL Server 2008 R2.  Scripting the index as DROP to an SSMS window returns the following; however, the ONLINE syntax is not supported with SQL Azure.

  • Benjamin Wright-Jones

    Editing XML using PowerShell


    I recently had a requirement to edit an XML document using PowerShell.  Fortunately, this is a relatively easy task as shown below.  I’ve used a system environment variable, %USERPROFILE%, to populate the path to the file.

    The script modifies a value in the XML document (highlighted in the example below) and ensures this is persisted using the .Save method.  The intention is to execute this script during user logon to ensure the value is always disabled in the application, ProClarity Web Professional 6.3 SP3.


    # Code changes CheckQuerySize="Yes" to CheckQuerySize="No"

    # Windows XP
    $xmlFile = "$env:userprofile\Application Data\ProClarity\Client\Options.xml"
    $xmlDoc = [XML](gc $xmlFile)
    $xmlDoc.ProClarity5Options.ProClarityApp.CheckQuerySize = "No"
    $xmlDoc.Save($xmlFile)

    # Windows Vista and Windows 7
    $xmlFile = "$env:userprofile\AppData\Roaming\ProClarity\Client\Options.xml"
    $xmlDoc = [XML](gc $xmlFile)
    $xmlDoc.ProClarity5Options.ProClarityApp.CheckQuerySize = "No"
    $xmlDoc.Save($xmlFile)

  • Benjamin Wright-Jones

    SQL Server 2005 Extended Triggers


    OK, I am impressed with everything in SQL Server 2005, especially extended triggers to capture DDL actions.  I was playing around with these today and I can see how this will be a nice feature, especially for audit trails.  I've included a simple script below.

    -- create sample db
    create database trigtestdb
    go

    use trigtestdb
    go

    create table tblextrig (eventid int identity, eventdata xml)
    go

    -- create event driven trigger
    create trigger trig_ddl
    on database
    for create_table, drop_table, alter_table
    as
    insert tblextrig values (eventdata())
    go

    -- do some ddl
    create table ben (id int)
    drop table ben
    create table ben (id int)

    -- check if DDL events have been logged by trigger
    select eventid, eventdata from tblextrig

    -- clean up
    use master
    go

    drop database trigtestdb


  • Benjamin Wright-Jones

    Intellisense add-in for SQL Server


    PromptSQL have produced an intellisense add-in for SQL Server - this is very cool. 

  • Benjamin Wright-Jones

    Processing Excel files in SSIS 2008 – Lessons Learned


    I have recently been working on a project where we need to load and transform data held in Excel 2007 into a SQL Server database.  The Excel sheets were fairly complex and had different structures.  Fortunately, SSIS 2008 provides some good tools to handle this situation. 

    Lessons learned

    1. It is possible to read the Excel metadata using the mechanism listed in this knowledge base article, HOW TO: Retrieve Meta Data from Excel by Using GetOleDbSchemaTable in Visual C# .NET.  You may be wondering why I want to do this.  Handling Excel sheets with different structures can be tricky so reading the metadata can help determine control flow processing i.e. which data flow to use to process the file.

    2. Remember, if you are testing on x64 then your package will not execute if you are using the Excel source, since it is not supported on 64-bit; you need to set the Run64BitRuntime property to False, as below


    3. The Script component is really great when it comes to data manipulation in the data flow.  This can be used as either a source, transformation or destination and allows you to manipulate the rows in the pipeline using either VB.NET or C#.

    4. As mentioned, Excel files can also be read using the Execute SQL task in the control flow, which is a nice feature e.g. SELECT * FROM $PivotData

    5. The File System Task can make file handling a lot easier.  Combined with variables and property expressions, dynamic file manipulation becomes a whole lot easier.  For example, after processing I move the file to another folder based on the outcome, e.g. success or failure. 
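    Returning to point 1, a minimal sketch of reading the sheet metadata via GetOleDbSchemaTable (the connection string and file path are assumptions; adjust for your environment):

```csharp
using System;
using System.Data;
using System.Data.OleDb;

class ExcelMetadataDemo
{
    static void Main()
    {
        // Connection string is an assumption; adjust the provider and path as required
        string connStr = "Provider=Microsoft.ACE.OLEDB.12.0;" +
                         "Data Source=C:\\Data\\Workbook.xlsx;" +
                         "Extended Properties=\"Excel 12.0 Xml;HDR=YES\"";

        using (OleDbConnection conn = new OleDbConnection(connStr))
        {
            conn.Open();

            // One row per worksheet/named range in the workbook
            DataTable sheets = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);

            foreach (DataRow row in sheets.Rows)
                Console.WriteLine(row["TABLE_NAME"]);
        }
    }
}
```

    The sheet names returned can then drive the control flow, i.e. choosing which data flow should process the file.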

  • Benjamin Wright-Jones

    SQL Server 2000 Post SP4 Hotfix Rollup Packages


    I really should keep a closer eye on those SQL Server knowledge base RSS feeds!  There are two new SQL Server 2000 Post SP4 hotfix rollup packages available both for the Database Engine and Analysis Services. 

    You can get them here:

    Hotfix rollup package for SQL Server 2000 Analysis Services Service Pack 4 Build 2174

    A cumulative hotfix package is available for SQL Server 2000 Service Pack 4 build 2187

  • Benjamin Wright-Jones

    SQL Server 2005 Service Pack 2 required for Longhorn & Vista


    In case you were wondering, Longhorn & Vista will only support SQL Server 2005 Service Pack 2. This should be out as a Community Technology Preview (CTP) build sometime soon (I'll blog about the forthcoming changes once we have publicly announced the CTP details). A more detailed explanation on SQL Support for Longhorn & Vista is posted here:

  • Benjamin Wright-Jones

    Database mail is cool...


    I'm now a big fan of Database Mail in SQL Server 2005 - it is so much better than SQLMail.  For starters, it is now cluster-aware (customers have been asking for this for a while now), supports SMTP and takes advantage of Service Broker.  Database Mail is configured at the user database level, and various objects are installed in the database being enabled for mail.  The stored procedure, sp_send_dbmail, is one of these objects and it takes a variety of parameters, e.g. queries can be executed and the results attached as a file.  The following script provides a simple example:

    EXEC dbo.sp_send_dbmail

    @profile_name = 'YUKON01_Database_Mail',

    @recipients = '<email address>',

    @body = 'This is a query test',

    @subject = 'Query Test',

    @query = '<T-SQL Statement>',

    @attach_query_result_as_file = 1 ;

    There are also two system tables which can be queried to look at the metadata and determine when the last message was sent; these are sysmail_mailitems and sysmail_log.  Please feel free to post any comments on your thoughts regarding Database Mail. 

  • Benjamin Wright-Jones

    Analysing SQL Server Error Logs; Useful Tool


    I recently found quite a useful tool hidden in the Windows Installer SDK called WiLogUtl.exe, which can be used to analyze/parse not just SQL Server error logs but any form of text-based log file.  The tool is embedded in the Windows Platform SDK (XPSP2), which can be downloaded from here.  The binary file is installed under the following directory tree: C:\Program Files\Microsoft Platform SDK for Windows XP SP2\Bin.  It provides a simple yet useful interface for browsing and analyzing those log files which are generated by failed setups and service pack installs. 

    I've recently corrected this post (thanks to Matt Tisdel) as I included an incorrect URL (the correct one is shown above).  Further information on WiLogUtl.exe can be found in MSDN here and here

    On a slightly different topic, my new favourite RSS reader is now Sauce Reader from Synop - this is great for consolidating RSS reads.

  • Benjamin Wright-Jones

    Add-on Components for SSIS

    I've just discovered a new 3rd-party add-on for Integration Services.  There is an SSIS wizard which simplifies the process of creating a custom task or data flow component.  There is also the Extra Sort component, which apparently provides enhanced performance for sorting large data sets.  I'll take a look at these add-ons and post my comments later. 
  • Benjamin Wright-Jones

    Redux: Using a C# script task in SSIS to download a file over http


    A few people have asked for further information about the C# script task which I blogged about (quite a while ago).  I mistakenly forgot to add the full source code, sorry everyone.  Here is the link to the original blog post:

    I have since imported the SSIS package into Visual Studio 2010 (BIDS) and the code compiles without error.  Some of the code below is truncated on the right (just a formatting issue I need to resolve) but the core of the code is there.  

    As always, let me know if there are any problems. 

    /*
       Download a file over http using Script Task in SQL Server 2008 R2 Integration Services.
       Two key variables: vSSOReportURL, which is constructed in a prior Script Task e.g. http://www..
       vSSOLocalFileName, which is the fully qualified reference for the downloaded file e.g. c:\
    */

    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Runtime;
    using System.Windows.Forms;
    using System.Net;
    using System.Net.Security;

    namespace ST_7e897e41dd5945f3b77366d32f0a97e0.csproj
    {
        [Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
        public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
        {
            #region VSTA generated code
            enum ScriptResults
            {
                Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
                Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
            };
            #endregion

            public void Main()
            {
                WebClient myWebClient;
                string RemoteURI;
                string LocalFileName;
                bool FireAgain = true;
                Variables vars = null;

                Dts.VariableDispenser.LockForRead("User::vSSOReportURL");
                Dts.VariableDispenser.LockForRead("User::vSSOLocalFileName");
                Dts.VariableDispenser.LockForWrite("User::vSSOReportURLIndicator");
                Dts.VariableDispenser.GetVariables(ref vars);

                try
                {
                    // Ignore certificate warnings
                    ServicePointManager.ServerCertificateValidationCallback =
                        new RemoteCertificateValidationCallback(delegate { return true; });

                    // Initiate webclient download, use default credentials (current login)
                    myWebClient = new WebClient();
                    myWebClient.Credentials = CredentialCache.DefaultCredentials;

                    RemoteURI = vars["User::vSSOReportURL"].Value.ToString();
                    LocalFileName = vars["User::vSSOLocalFileName"].Value.ToString();

                    // Log provider notification
                    Dts.Events.FireInformation(0, String.Empty, String.Format("Downloading '{0}' from '{1}'", LocalFileName, RemoteURI), String.Empty, 0, ref FireAgain);

                    // Download the file
                    myWebClient.DownloadFile(RemoteURI, LocalFileName);

                    // Set report URL indicator, this is used to determine the http source of the
                    // download i.e. vSSOReportURL or vSSOReportURLRetry for the message which is
                    // written to the table
                    vars["User::vSSOReportURLIndicator"].Value = 0;

                    // Return success
                    Dts.TaskResult = (int)ScriptResults.Success;
                }
                catch (Exception ex)
                {
                    // Catch and handle error
                    Dts.Events.FireError(0, String.Empty, ex.Message, String.Empty, 0);
                    Dts.TaskResult = (int)ScriptResults.Failure;
                }
            }
        }
    }

    The files are extracted using an Execute Process Task (with 7-Zip) as shown below:


    And the arguments are set using the expression (below).  There are probably better ways of doing this but I found this worked well.


    The .zip file is then archived using a File System task and the extracted file is renamed to .xlsx. 

  • Benjamin Wright-Jones

    Unattended clusters installs in SQL Server 2005 / Mount Point Support


    Ok, so not exactly a killer feature, but I know a few customers who have been asking for this for a while, especially as most of the builds are engineered and deployed via SMS.  I think this is one of those 'nice to have' features.  From Beta 2 BOL: "For an unattended installation of a failover cluster, you must use /qn.  The /qn switch specifies a silent installation, displaying no user interface dialog boxes, even in cases of Setup errors or failure.  If the /qn switch is used, all Setup messages are written to Setup log files".  You'll also be pleased to hear that we will now support mount points on a Windows Server 2003 Enterprise Edition cluster running SQL Server 2005.  This should be really beneficial to those who have a requirement to exceed the 26 drive letter limitation (great news for consolidation).

  • Benjamin Wright-Jones

    A DTS problem which is bothering me...


    I can't believe it's almost July and it's pouring down with rain.  I've been thinking about a problem a customer asked me about last week regarding DTS and the lack of interoperability between SQL Server 7 and 2000.  To cut a long story short, I was asked to document the known DTS bugs for each (public) build of SQL Server 7 and 2000 - so, RTM to SP3/4 etc.  As you probably know, the security in 2000 was changed and this introduced some compatibility problems; there are also some caveats around using the SQL Server 2000 client tools to edit packages designed in 7.0.  However, this is just the tip of the iceberg.  There are many DTS KBs which I need to check out.  There's also a great resource here which I highly recommend - I believe it's run by an MVP.  I've had a lot of people ask for a setup option to not install DTS - I can understand how this would solve a lot of problems and I know SQL Server 2005 addresses this. 

  • Benjamin Wright-Jones

    Making a database clone using SMO


    I’ve been playing around with SMO a bit more over the past few days as I’m finding it really useful in some situations.  Following on from my previous test with SMO and indexes, I was wondering if I could script the database statistics and histograms in SMO rather than use Management Studio.  As a quick reminder, it is possible to make a ‘clone’ of the database statistics and histograms in SQL Server 2005 (with SP2).  This option is buried in the generate scripts window.  A database clone can be very useful when troubleshooting query plans, rather than having to rely on a full copy of the database (data+objects+stats).  I actually think using SMO directly in a C# console app is faster than using the SSMS interface, but I need to do more testing to validate this. 

    You’ll see below that I generate scripts of the CREATE DATABASE statement, tables plus statistics and histograms, stored procedures, user-defined functions, partition schemes and partition functions, whilst excluding all system objects. 

    using System;
    using System.Collections.Generic;
    using Microsoft.SqlServer.Management.Common;
    using Microsoft.SqlServer.Management.Smo;
    using System.Text;
    using System.Collections.Specialized;

    namespace SQLSMO
    {
        class SMOClone
        {
            static void Main(string[] args)
            {
                string servername;
                string databasename;

                servername = "servername\\instance";
                databasename = "databasename";

                Server server = new Server(servername);
                Database db = server.Databases[databasename];

                //include the database create syntax
                ScriptingOptions dbso = new ScriptingOptions();
                dbso.FileName = "e:\\" + databasename + "-create.sql";
                dbso.AppendToFile = true;

                Console.WriteLine("Scripting database: " + databasename + ". Please wait...");
                db.Script(dbso);

                //scripting options
                ScriptingOptions tso = new ScriptingOptions();
                tso.ScriptDrops = false;
                tso.Indexes = true;
                tso.ClusteredIndexes = true;
                tso.PrimaryObject = true;
                tso.SchemaQualify = true;
                tso.NoIndexPartitioningSchemes = false;
                tso.NoFileGroup = false;
                tso.DriPrimaryKey = true;
                tso.DriChecks = true;
                tso.DriAllKeys = true;
                tso.AllowSystemObjects = false;
                tso.IncludeIfNotExists = false;
                tso.DriForeignKeys = true;
                tso.DriAllConstraints = true;
                tso.DriIncludeSystemNames = true;
                tso.AnsiPadding = true;
                tso.IncludeDatabaseContext = false;
                tso.FileName = "e:\\" + databasename + "-clone.sql";
                tso.AppendToFile = true;
                //include statistics and histogram data for db clone
                tso.OptimizerData = true;
                tso.Statistics = true;

                foreach (Table t in db.Tables)
                {
                    if (!t.IsSystemObject)
                    {
                        Console.WriteLine("Scripting Table & Statistics: " + t);
                        t.Script(tso);
                    }
                }

                foreach (StoredProcedure sp in db.StoredProcedures)
                {
                    if (!sp.IsSystemObject)
                    {
                        Console.WriteLine("Scripting Stored Procedure: " + sp);
                        sp.Script(tso);
                    }
                }

                foreach (UserDefinedFunction udf in db.UserDefinedFunctions)
                {
                    if (!udf.IsSystemObject)
                    {
                        Console.WriteLine("Scripting Function: " + udf);
                        udf.Script(tso);
                    }
                }

                foreach (PartitionFunction pf in db.PartitionFunctions)
                {
                    Console.WriteLine("Scripting Partition Function: " + pf);
                    pf.Script(tso);
                }

                foreach (PartitionScheme ps in db.PartitionSchemes)
                {
                    Console.WriteLine("Scripting Partition Scheme: " + ps);
                    ps.Script(tso);
                }

                Console.Write("Scripting completed. Press any key to exit.");
                Console.ReadKey();
            }
        }
    }

  • Benjamin Wright-Jones

    SQL Server 2005 - Top 5 Favourite Features


    After experimenting with the numerous features of SQL Server 2005 in beta 2 and subsequent CTP releases, I have decided upon my favourite features.  Although they're not necessarily the most important, they will make my life so much easier. 

    1. Database Mirroring (High Protection Mode)
    2. CLR Hosting (for complex code / calculations)
    3. Full Bi-Directional (Peer-to-Peer) Replication 
    4. Full Text Enhancements
    5. SQLCMD (and the dedicated administrator connection)

    I should really mention Service Broker; however, I haven't (as yet) looked at it in any detail.  There is a good resource slowly developing here.  There is also a nice tool in development called the Service Broker Explorer which provides more control over the elements in a GUI.  It can be downloaded from here.  I should mention that it is not intended for use on a production system; it's more of a learning tool. 

  • Benjamin Wright-Jones

    SQL Server Power Tools


    I have just gone through the process of having to build a stable Vista x64 OS on my new Lenovo T61p laptop - which I must say is very impressive.  Apart from the issues with Groove not completely supporting 64-bit (arghh), I installed SQL Server 2005 Developer Edition (SP2 + hotfix package) and then began thinking about what tools/utilities I use during my day-to-day job and which are a must-have on my laptop.  Below is the list of software which I generally install straight away (I'm sure there are other useful tools out there):

    SQL Server 2000 Books Online (June 2007) - essential documentation for those disconnected working environments.

    SQL Server 2005 Books Online (September 2007) - keep up-to-date with the latest 2005 documentation

    SQL Server 2005 Express Edition (for testing different scenarios/features e.g. replication subscription)

    SQL Server 2005 Upgrade Advisor - handy when I need to demo or investigate upgrade issues.

    SQL Server 2005 Performance Dashboard Reports - Customers love this software as it exposes a lot of the DMV info in a nice graphical format. 

    SQL Server 2005 System Views Map - not quite as good as the SQL Server 2000 chm system table browser in the resource kit but still useful.

    SQL Server 2005 Samples and Sample Databases

    SQL Server DMV Stats v1.01

    Read80Trace - according to the PSS guys a new version should be out soon which will support SQL Server 2005.

    PSSDiag - useful diagnostic tool.


    Microsoft Premier Services Reporting Utility (PFE version) - I use, and recommend, this version over the others as it supports more hardware platforms, e.g. IA64.  Read more about MPS reports here.

    Visual Studio 2008 (which includes Team Database Edition).  Read more here:

    MSDN Library (April 2007 Edition) – Includes Visual Studio 2005 Service Pack 1 documentation

    Virtual PC 2007 (must-have for testing legacy and future SQL Server releases)

    Data Mining add-ins for Office 2007 (also available as part of the feature pack)

    3rd Party Software

    Redgate sqlprompt - until Katmai is released which has native intellisense :-)

    Redgate Refactor - which is great for automatically indenting my T-SQL logic

    Beyond Compare (for quick script/schema comparisons) by Scooter Software

    Erwin Data Modeler (invaluable ER modeler, great tool)

    Litespeed - I need to be familiar with Litespeed as a few customers use it for VLDB backups.  

    ..and how could I forget Oracle 10/11g for testing connectivity, SSIS ETL processes etc. for all those migration projects :-)

    Please let me know if you recommend any other useful SQL tools and I'll add them to the list. 

  • Benjamin Wright-Jones

    Check out the SQL Server 2005 Express Manager Preview...


    I had a look at the SQL Server 2005 Express Manager a few days ago and realised that it works with developer, standard and enterprise editions of SQL Server 2005.  I was also impressed to discover that it also works with SQL Server 2000 instances.  I should also mention that you'll need to install .NET Framework 2.0 - the link is embedded on the download page at the following location:

    The express manager is a light-weight client, similar to Query Analyzer but not as feature-rich.  It is a good alternative to the query editor in SQL Server 2005 Management Studio (application launch time is much better in Express Manager).  I just hope that a drop-down combo is added in future releases and that it also remembers the server/instance name. 

    BTW, there are a lot of good whitepapers appearing on the public website, have a look at

  • Benjamin Wright-Jones

    I've been gone a while but I'm back....


    Hi all

    Apologies for not updating my blog as often as possible - I've been on vacation for the past couple of weeks and managed to lose most of my possessions when the villa was broken into!  Fortunately, my wallet and passport were recovered in Valencia last week - they were found dumped in a skip - so I'm in the process of trying to recover them from the Spanish Guardia Civil (which is not easy)!

    Everything is almost sorted out and I'm finally getting back into the SQL Server world.  There's a new SQL Server 2005 portal and some excellent whitepapers here:

    I'm currently working on a few support issues related to EMPTYFILE and SQLXML - I'll post my findings once I've done some research and analysis.  It's too early to add further comments at the moment :-)



  • Benjamin Wright-Jones

    Building a data mart to analyse web log traffic

    I recently had the opportunity to build a data mart in order to analyse web logs from IIS servers.  The solution utilised SQL Server 2012 Database Engine, Integration Services, Analysis Services and Excel 2010 (to slice/dice the data).   

    Loading IIS logs (text files with fixed-width spacing) was fairly challenging, as the column definitions would alter throughout.  I therefore used the .NET StreamReader class to read the metadata and detect changes in the fields. 

    The data flow is shown below.


    A snippet of the code in the Load Data (Source Script Component) is presented below

    Code Snippet

    // Get variables
    strSourceFile = Variables.vCurrentSourceFileName;
    intAuditLogFileIdentifier = Variables.vAuditExecutionFileIdentifier;

    try
    {
        // Create an instance of StreamReader to read from a file.
        // The using statement also closes the StreamReader.
        using (StreamReader sr = new StreamReader(strSourceFile))
        {
            String line;
            int intNumberOfFields = 0;
            string[] strListOfFields = null;

            Trace.WriteLine("Log File: " + strSourceFile);

            // Output the source file name as the first line (debugging purposes)
            OutputLogFileRawDataBuffer.AddRow();
            OutputLogFileRawDataBuffer.colRawData = strSourceFile;

            // Read and display lines from the file until the end of the file is reached.
            while ((line = sr.ReadLine()) != null)


    Extracting the data from the file was relatively straightforward.  I placed the string into an array based on the fixed spacing between fields.  From the data, I was able to extract useful information such as browser type i.e. Safari, Chrome, IE and even browser version. 
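    A hedged sketch of that splitting step (the sample line and field positions are assumptions; the real order depends on each log's #Fields directive):

```csharp
using System;

class LogLineDemo
{
    static void Main()
    {
        // Example W3C extended log line; IIS separates fields with spaces
        string line = "2012-01-15 10:22:31 GET /default.aspx 200 Mozilla/5.0+(Windows+NT+6.1)";

        // Place the fields into an array
        string[] fields = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        Console.WriteLine("Date:       " + fields[0]);
        Console.WriteLine("Method:     " + fields[2]);
        Console.WriteLine("User agent: " + fields[5]);
    }
}
```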

    IP addresses were mapped to geolocation using the free GeoLite information (CSV data imported into the database).  I converted longitude and latitude to the spatial geography data type and presented this against a world map (I wanted to validate that the GeoLite data correctly mapped to the IP address e.g. country/city to IP address).

       USE BiKitchen;

       DECLARE @SRID int = 4326;
       DECLARE @pLat nvarchar(max);
       DECLARE @pLong nvarchar(max);
       DECLARE @g geography;

       -- Check longitude and latitude for London
       SET @pLat = (SELECT CAST(Latitude AS nvarchar(max)) FROM [GeoData].[GeoLiteCity-Location] WHERE locid = 13547);
       SET @pLong = (SELECT CAST(Longitude AS nvarchar(max)) FROM [GeoData].[GeoLiteCity-Location] WHERE locid = 13547);

       SET @g = geography::STPointFromText('POINT(' +
               @pLong + ' ' +
               @pLat + ')', @SRID).BufferWithTolerance(11000, 1000, 0);

       SELECT @pLat AS Latitude, @pLong AS Longitude;
       -- SELECT @g.Lat, @g.Long

       -- Map the geography type to base world map data
       -- View the result in the spatial tab to validate the coordinates
       SELECT @g AS spatiallocation
       UNION ALL
       SELECT geog FROM World_Borders;

    The star schema was built and deployed in SQL Server 2012 Analysis Services (UDM).  I found named calculations to be an incredibly powerful way of extending the data model and making attributes more meaningful for end-users.


    The data was presented using Excel 2010; a screenshot is shown below.  I found slicers to be extremely useful.




    I thought it would be interesting to see what Excel 2013 had to offer, so I tried to create a Power View report, but this is not currently supported against the UDM.  There are, however, some nice enhancements to chart types, so I’ll be looking at these in more detail.


  • Benjamin Wright-Jones

    General ravings...IDW9....


    Hello again.  I'm pleased to announce that the latest build of SQL Server 2005 includes two new T-SQL commands, INTERSECT and EXCEPT; I'll talk about these a little later.  I've just been on a great SQL Server 2005 training course delivered by Niels Berglund of DevelopMentor.  Niels introduced me to some great features which I'll talk about in this blog.  I'm going to run through the labs again as I missed some of the slides on HTTP endpoints and Service Broker.  There are so many cool things in SQL Server 2005 that it's difficult to know where to begin.  SQLCLR is such a great enhancement.  I'll post again soon....
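    As a quick taste of the two new set operators, here is an illustrative example (the temp tables below are my own, not from the course):

```sql
-- Two small temp tables to compare
CREATE TABLE #A (val int);
CREATE TABLE #B (val int);

INSERT INTO #A VALUES (1);
INSERT INTO #A VALUES (2);
INSERT INTO #A VALUES (3);
INSERT INTO #B VALUES (2);
INSERT INTO #B VALUES (3);
INSERT INTO #B VALUES (4);

-- INTERSECT: distinct rows returned by both queries
SELECT val FROM #A
INTERSECT
SELECT val FROM #B;  -- returns 2, 3

-- EXCEPT: distinct rows in the first query but not the second
SELECT val FROM #A
EXCEPT
SELECT val FROM #B;  -- returns 1

DROP TABLE #A;
DROP TABLE #B;
```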

  • Benjamin Wright-Jones

    Site Updates & SQL Server Nuggets on MSDN


    I've made a few changes to the layout of the blog site - hopefully this is much easier to navigate.  Anyway, whilst recently browsing Tim Sneath's blog I came across a reference to SQL Server Nuggets.  This is a cool resource if you're looking for a gentle introduction to some of the CLR features in SQL Server 2005.  The short webcasts are also downloadable (see links below).

    SQL Server 2005

    If you've got a couple of moments to spare, check out the MSDN Nuggets page as there are also some good .NET developer demos. 

  • Benjamin Wright-Jones

    Using SQL Server 7.0/2000 in a Hyper-Threaded Environment


    I've just read a good article which discusses using SQL Server in a hyper-threaded environment.  To cut a long story short, we recommend that the maxdop (maximum degree of parallelism) setting remain equal to the number of physical processors in use. 
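    For reference, the setting is changed with sp_configure; the value 4 below is just an example for a server with four physical processors:

```sql
-- max degree of parallelism is an advanced option, so expose it first
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Example: cap parallelism at 4 on a server with 4 physical processors
EXEC sp_configure 'max degree of parallelism', 4;
RECONFIGURE;
```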


    "…The performance of hyper-threaded environments varies. Conservative testing has shown 10 to 20 percent gains for SQL Server workloads, but the application patterns have a significant effect. You might find that some applications do not receive an increase in performance by taking advantage of hyper-threading. If the physical processors are already saturated, using logical processors can actually reduce the workload achieved.

    For example, applications that cause high levels of contention can cause decreased performance in a hyper-threaded environment. We recommend that you test your application thoroughly to make sure that a hyper-threaded environment provides you the performance gain that you want versus the purchase of equivalent physical CPUs. Hyper-threading can be very helpful but hyper-threading cannot replace the full power of an additional physical CPU"
