SQL Server 2008 R2 Datacenter
SQL Server 2008 R2 Parallel Data Warehouse
http://www.microsoft.com/sqlserver/2008/en/us/R2-editions.aspx
I have recently been working on a project where we needed to load and transform data held in Excel 2007 into a SQL Server database. The Excel sheets were fairly complex and had different structures. Fortunately, SSIS 2008 provides some good tools to handle this situation.
Lessons learned
1. It is possible to read the Excel metadata using the mechanism described in this knowledge base article: http://support.microsoft.com/kb/318452 HOW TO: Retrieve Meta Data from Excel by Using GetOleDbSchemaTable in Visual C# .NET. You may be wondering why I want to do this. Handling Excel sheets with different structures can be tricky, so reading the metadata helps the control flow decide which data flow to use to process the file.
2. Remember, if you are testing on x64, your package will not execute if it uses the Excel source, since this is not supported in 64-bit mode. Set the Run64BitRuntime property to False (project properties, Debugging page) so that the package runs in 32-bit mode.

3. The Script component is really great when it comes to data manipulation in the data flow. It can be used as a source, a transformation or a destination, and allows you to manipulate the rows in the pipeline using either VB.NET or C#.
4. Excel files can also be read using the Execute SQL task in the control flow, which is a nice feature. Note that a worksheet is referenced by its name followed by a dollar sign, in square brackets, e.g. SELECT * FROM [PivotData$]
5. The File System task makes file handling a lot easier. Combined with variables and property expressions, dynamic file manipulation becomes a whole lot easier. For example, after processing I move the file to a different folder depending on the outcome, i.e. success or failure.
This question came up today: how can I change the IP address between the principal and mirror in database mirroring?
Not a common operation but this procedure worked in an isolated lab environment where we had full control over the application and transaction activity. We wanted to introduce a WAN latency injector so needed to change the database mirroring IP addresses on the principal and mirror.
- Stop application activity
- Remove mirroring (SET PARTNER OFF)
- Stop Mirroring endpoints (on principal and mirror)
- Alter Mirroring endpoints to use new IP addresses e.g. ALTER ENDPOINT <endpoint_name> AS TCP (LISTENER_PORT = <port>, LISTENER_IP = (x.x.x.x))
- Start endpoints on principal and mirror
- Enable mirroring (ALTER DATABASE <dbname> SET PARTNER = 'TCP://x.x.x.x:<port>')
I’ll try and find the exact scripts we used and upload them here.
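Until then, the steps above can be sketched in T-SQL roughly as follows. This is a sketch under assumptions, not the exact scripts from the lab: the database name MyDb, the endpoint name Mirroring, port 5022 and the 10.0.0.x addresses are all placeholders. Note that SET PARTNER has to be run on the mirror before it is run on the principal, and if any log backups were taken on the principal in the meantime they would need to be restored on the mirror (WITH NORECOVERY) before the session will start.

```sql
-- 1. On the principal: remove the mirroring session.
--    The mirror database is left in the RESTORING state.
ALTER DATABASE [MyDb] SET PARTNER OFF;

-- 2. On BOTH the principal and the mirror: stop the endpoint,
--    bind it to the new IP address, then start it again.
ALTER ENDPOINT [Mirroring] STATE = STOPPED;

ALTER ENDPOINT [Mirroring]
    AS TCP (LISTENER_PORT = 5022, LISTENER_IP = (10.0.0.1));  -- placeholder; use the mirror's own address on the mirror

ALTER ENDPOINT [Mirroring] STATE = STARTED;

-- 3. On the mirror FIRST: point at the principal's new address.
ALTER DATABASE [MyDb] SET PARTNER = 'TCP://10.0.0.1:5022';

-- 4. Then on the principal: point at the mirror's new address.
ALTER DATABASE [MyDb] SET PARTNER = 'TCP://10.0.0.2:5022';
```

The state of the endpoints can be checked before and after with SELECT name, state_desc FROM sys.database_mirroring_endpoints.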
I haven’t had a chance to look through these yet so I can’t comment on the content but I thought I would post here to share these new resources.
HP Business Intelligence Sizer for Microsoft SQL Server 2005/2008
http://h71019.www7.hp.com/ActiveAnswers/us/en/sizers/microsoft-sql-bi.html
HP Whitepapers on SQL Server 2008 Data Warehousing / Business Intelligence
http://h20195.www2.hp.com/V2/GetPDF.aspx/4AA2-5263ENW.pdf
http://h20195.www2.hp.com/V2/getdocument.aspx?docname=4AA2-8173ENW.pdf
http://h20195.www2.hp.com/V2/GetPDF.aspx/4AA2-7162ENW.pdf
I often see questions about transactional replication performance problems, especially around latency/delays between the publisher and subscriber(s), so I’ve put a few pointers below on what to investigate. Latency between the publisher, distributor and subscriber(s) is, more often than not, a symptom of some other cause, for example poor I/O capacity on subscribers, blocking/locking, hotspots on indexes or a high number of virtual log files.
Troubleshooting tips:
- Look at perfmon counters (Disk Reads/sec, Disk Writes/sec, Avg. Disk sec/Read and Avg. Disk sec/Write) to ensure there is enough capacity and that the latency on the data and log drives is within our recommended boundaries.
- Look at wait stats (use DMVstats, highly recommended) to see which resources are being waited on. This will give you a good indication of where the bottleneck is.
- Look at the transactional replication performance monitor counters (pending Xacts, transactions/sec, latency etc.)
- Check the number of VLFs http://support.microsoft.com/kb/949523 as a very high number of VLFs can have a negative impact on log scanning. I tend to ensure this value is below 1000.
- Use tracer tokens to check latency from publisher to distributor to subscriber
- Use agent logging to external files (-OutputVerboseLevel 2 -Output <dir\file>) to troubleshoot data issues
- Look in MSlogreader_history, MSdistribution_history and MSrepl_errors in the distribution database
- Consider external factors e.g. consult the network/SAN specialists to check external issues such as network bandwidth/array performance issues etc.
- If blocking is suspected then use the blocked process trace definition. I can highly recommend this as it provides incredibly valuable information about the blocked and blocking processes.
- If you are using database mirroring in conjunction with transactional replication then the log reader may have to wait for the log records to be hardened on the mirror. This can be avoided by using trace flag 1448 on the publisher.
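A few of the checks above can be run directly from a query window. The following is a minimal sketch rather than a complete diagnostic script; the database name PublishedDb and the publication name MyPublication are placeholders, and DBCC LOGINFO is an undocumented command, so treat its output format as subject to change.

```sql
-- Wait stats: the top waits accumulated since the last restart
-- (or since the statistics were last cleared).
SELECT TOP (10) wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;

-- VLF count: DBCC LOGINFO returns one row per virtual log file
-- in the current database, so the row count is the number of VLFs.
USE [PublishedDb];
DBCC LOGINFO;

-- Tracer tokens: post a token into the log of the publication database...
EXEC sys.sp_posttracertoken @publication = N'MyPublication';

-- ...then list the posted tokens. Pass a tracer_id from this list to
-- sp_helptracertokenhistory to see the latency at each hop.
EXEC sys.sp_helptracertokens @publication = N'MyPublication';

-- Most recent replication errors recorded at the distributor.
USE [distribution];
SELECT TOP (20) [time], error_code, error_text
FROM dbo.MSrepl_errors
ORDER BY [time] DESC;

-- Trace flag 1448, enabled globally on the publisher
-- (only relevant when the published database is mirrored).
DBCC TRACEON (1448, -1);
```

A trace flag enabled with DBCC TRACEON does not survive a restart; adding -T1448 as a startup parameter makes it permanent.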
Optimisation tips:
- Use agent profiles to optimise for workloads
- Implement Read Committed Snapshot Isolation (RCSI) on subscribers to alleviate reader/writer blocking (when doing this consider the impact on tempdb as this is where the version store is located)
- Ensure the distribution history clean-up job is correctly trimming the distribution database tables.
- If there are data consistency issues, consider using tablediff to compare data in publisher/subscriber tables. Warning: this may take a while with large volumes of data, although tablediff can be run against a subset of the data by using views.
- Be careful about using -SkipErrors to bypass consistency errors http://support.microsoft.com/kb/327817
- Consider using -SubscriptionStreams on the distribution agent to use multiple threads to apply the data to the subscribers. Read this http://support.microsoft.com/kb/956600 and this http://support.microsoft.com/kb/953199
- If initialising from a backup/copy of the database, don’t enforce integrity on the subscribers. Drop the constraints or use the NOT FOR REPLICATION option.
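Two of these tips translate into short T-SQL statements. This is a sketch only: the database name SubscriberDb and the table, column and constraint names are made up for illustration. Be aware that WITH ROLLBACK IMMEDIATE disconnects other sessions in the database and rolls back their open transactions, so pick a quiet moment.

```sql
-- Enable Read Committed Snapshot Isolation on the subscriber database.
-- The change needs exclusive access to the database, hence ROLLBACK IMMEDIATE.
ALTER DATABASE [SubscriberDb]
    SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

-- Confirm the setting.
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = N'SubscriberDb';

-- Re-create a foreign key on the subscriber so that it is not enforced
-- for changes applied by the replication agents, and is not checked
-- against the existing rows when it is added.
ALTER TABLE dbo.OrderLine WITH NOCHECK
    ADD CONSTRAINT FK_OrderLine_Order
    FOREIGN KEY (OrderId) REFERENCES dbo.[Order] (OrderId)
    NOT FOR REPLICATION;
```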
We recently had the opportunity to test a couple of the Fusion IO PCI-Express 640GB SSD cards http://www.fusionio.com/Products.aspx in a Dell R900 server. Unfortunately, time was against us and we were unable to do this. The Fusion IO SSD cards should dramatically increase the IOPS capacity and, personally, I think they would be well suited to storing tempdb. I’m a bit cautious about using SSD for data and transaction log files, so tempdb seems like the best fit.
I have only just noticed that a new revision of the SQL Server 2008 Books Online documentation has been published. The download is here: http://www.microsoft.com/downloads/details.aspx?FamilyID=765433f7-0983-4d7a-b628-0a98145bcb97&DisplayLang=en
I find it frustrating that the SQL Server 2005/2008 default trace is continually overwritten and there is no way to store x number of files or x MBs of data. As a workaround, I developed an SSIS package to monitor the \LOG folder and automatically archive the default trace file whenever a new file is created.
This consists of a FOR LOOP container, a Script Task and a File System Task plus a whole bunch of variables and property expressions.
The guts of the package are in the Script Task, as this is where I use a WMI query to monitor the \LOG folder for .trc files. Each new file is then archived under a new name (date-time-servername-file) in another folder or share, which can be a UNC path e.g. \\server\share. This way I have a permanent record of the basic server activity for root cause analysis/troubleshooting.
The screenshot below shows the basic structure of the package.
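The package itself relies on a WMI query, but as a complementary T-SQL sketch, this is how the current default trace file can be located and how an archived file can be read back for analysis. The archived file name shown is a made-up example of the date-time-servername-file pattern.

```sql
-- Location and rollover settings of the default trace.
SELECT [path], max_size, max_files, start_time
FROM sys.traces
WHERE is_default = 1;

-- Read an archived trace file back into a rowset.
-- DEFAULT tells the function to read any rollover files that follow it.
SELECT StartTime, EventClass, DatabaseName, LoginName, TextData
FROM sys.fn_trace_gettable(N'\\server\share\20090601-120000-MYSERVER-log_42.trc', DEFAULT)
ORDER BY StartTime;
```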

In case you missed the Tech-Ed 2009 announcement, you can find the info here http://www.microsoft.com/sqlserver/2008/en/us/MDS.aspx
“On initial scoping, it was determined that 'Bulldog' would ship as part of Microsoft Office SharePoint in the O14 wave. At TechEd 2009, we announced a change in packaging for the new MDM capabilities. Project 'Bulldog' will now ship as part of the next release of SQL Server codenamed ‘Kilimanjaro’ as 'SQL Server Master Data Services'.
This means that in addition to new capabilities such as Self Service BI and multi-server management, SQL Server ‘Kilimanjaro’ will also provide customers with a rich platform for MDM through SQL Server Master Data Services. Customers who have purchased Software Assurance (SA) should view this as net new value and innovation that they will have access to as a result of their investments in SA.”
After a few late nights, some coffee and a few review cycles, a new article has just been published on the SQLCAT site. It provides an overview of the subscriber initialisation techniques for transactional replication and, more specifically, of initialising from an array-based snapshot: http://sqlcat.com/technicalnotes/archive/2009/05/04/initializing-a-transactional-replication-subscriber-from-an-array-based-snapshot.aspx
I recently had a requirement to automate downloading a file from a website and then perform ETL on the data in the file. Fortunately, this is possible via the Script Task in SSIS (note that this is using SQL Server 2008 Integration Services). I found a couple of web references to do this in VB.NET but I prefer C#, so I modified the code and made some adjustments to suit my (debugging) needs. I set two package variables to store the URL (source) and filename (destination); the script reads them into the local variables RemoteURI and LocalFileName.
This works really well and I can change the variables at run-time using property expressions:
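// These using directives belong at the top of the script file (ScriptMain.cs)
using System;
using System.Net;
using System.Windows.Forms;

// Main() lives inside the ScriptMain class generated by the Script Task.
// Both package variables must be listed in the task's ReadOnlyVariables.
public void Main()
{
    bool FireAgain = true;
    Dts.Log("entering download..", 999, null);
    try
    {
        // Read the source URL and the destination path from the package variables
        string RemoteURI = Dts.Variables["User::vPipeline"].Value.ToString();
        string LocalFileName = Dts.Variables["User::vLocalFileName"].Value.ToString();

        // Debug output only: MessageBox.Show blocks until someone clicks OK,
        // so remove these lines before running the package unattended
        Console.WriteLine(RemoteURI);
        Console.WriteLine(LocalFileName);
        MessageBox.Show(RemoteURI);
        MessageBox.Show(LocalFileName);

        // Notification
        Dts.Events.FireInformation(0, String.Empty, String.Format("Downloading '{0}' from '{1}'", LocalFileName, RemoteURI), String.Empty, 0, ref FireAgain);

        // Download the file; the using block disposes the WebClient afterwards
        using (WebClient myWebClient = new WebClient())
        {
            myWebClient.DownloadFile(RemoteURI, LocalFileName);
        }
        Dts.TaskResult = (int)ScriptResults.Success;
    }
    catch (Exception ex)
    {
        // Report the error to the package and fail the task
        Dts.Events.FireError(0, String.Empty, ex.Message, String.Empty, 0);
        Dts.TaskResult = (int)ScriptResults.Failure;
    }
}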
[Non-SQL related] I’ve just discovered an interesting site called Microspotting, which came to my attention courtesy of the grapevine. At first I wasn’t sure what it was about but, after a little bit of digging, it would appear that the site is dedicated to sharing stories about Microsoft FTEs. As the site proclaims, it is a bit like having an internal paparazzi, but it provides some good insight. Take a look here http://www.microspotting.com/