Now that SQL Server 2008 SP1 is released (download link here), it is time to test a very interesting capability: Service Pack slipstreaming. That means that you can save some time by doing an 'integrated' installation of SQL 2008 which includes the SP1 binaries and avoid having to apply SP1 later on. Windows had this capability for some time but only now is it officially supported for SQL Server 2008.
Scenario
This walk-through was conducted on a 2-node, 2-SQL instance Windows 2003 x86 cluster. The 2 instances were:
Objective and Tools
We want to upgrade the SQL 2000 instance to SQL 2008, in-place. Currently we will not be touching the SQL 2005 instance (maybe I will do another blog post for that if there are any specific observations.) The objective was to minimize downtime of either instance while performing an in-place upgrade to SQL Server 2008 SP1. To achieve the objective, we use 2 new features which are available in SQL Server 2008:
Step-by-step
1. Pre-requisites
1. Prepare the merged (slipstreamed) media
In my case I was just dealing with x86 instances, so I optimized some steps from the blogs above
2. Run setup on passive node
I ran setup from the command prompt, specifying the PCUSource by hand:
Initially I select the SQL 2000 instance, which is active on the other node:
In the cluster security policy screen, you need to enter the service account domain group names. Now something related to this step caused a problem for later on, I will explain it later:
At the Upgrade Rules screen in setup, you can verify that we are slipstreaming:
Also in Upgrade Rules you might be warned that any other SQL instances active on this node will be restarted due to cluster resource DLL update. This is very important if you have not planned on those other instances being restarted. So you should note it and factor it into your upgrade plans:
Later the cluster upgrade report clearly tells us it is going to upgrade just this (passive) node:
Once again we check we are slipstreaming:
The rest of setup was fairly uneventful and at the end I checked the sqlservr.exe version on this (passive) node:
If you check the old SQL 2000 installation folder, you will note that the binaries and other folders have been cleaned up.
3. Change group for failed Full-Text Search resource
At this stage you may notice that the full-text search service cluster resource is either stopped or plain does not even show up. In my case I received the following message in cluster admin. I just moved out the fulltext resource into another cluster group for the moment.
4. Failover and watch the upgrade of system databases happen
At this stage, you are ready to failover the SQL 2000 instance from (let's say) NodeA (which is still running SQL 2000 and has not been touched by upgrade process) to the NodeB (on which the SQL 2000 service has been upgraded, binaries-only, to SQL 2008). What is supposed to happen after the move group, is that the SQL 2000 instance is temporarily unavailable, SQL 2008 service starts up on NodeB, it then upgrades the system databases to SQL 2008 'format' and also upgrades the user databases to SQL 2008 'format'. This is technically a point of 'no-return' from which roll back means reinstalling SQL 2000 and restoring from backups.
In my case, things did not go well initially, and on failover from NodeA to NodeB, SQL 2008 instance failed to come online on NodeB, restarted and couple of times and then flipped back to NodeA. Checking the event log showed this message:
initerrlog: Could not open error log file ''. Operating system error = 3(The system cannot find the path specified.).
I verified the startup parameters using Configuration Manager on NodeB and also double-checked using Enterprise Manager on NodeA, and they were correct. The only other possibility was a permissions issue, and I used Process Monitor from SysInternals to track down the issue. It turned out that my SQL 2008 service account (which was the same as what the 2000 version used) did not have access to a registry key. Then it dawned on me that the domain group membership I referred to previously, was not granted. To fix this was easy, used AD Users and Computers and added my service account into this group:
After this was fixed, I was able to move group again to NodeB at which stage the database upgrades happened and the instance was online. Here are some random snippets from the SQL errorlog at the time of upgrade, just to show you what happened under the hood:
Total time from start to finish: 1:01 minutes. That means that my instance was potentially unavailable just for this long. That is pretty impressive!
5. Run setup on previously active node
The process on NodeA was pretty much the same as described above. All of the steps described above are relevant, except for #4 above, because the databases are already upgraded once. The only thing you may come across if you have not handled the full-text search resource issue would be an error in setup: 'The device is not ready':
I am not 100% sure, but it seems related to the fact that full text was not moved out to another group. Moving it out fixed the above setup error on NodeA.
6. Verify setup
Just a casual check here using Management Studio (yes, the management tools get upgraded from Query Analyzer / Enterprise Manager to SSMS) and verify that we are indeed on SQL 2008 SP1:
Key Learnings
I hope you enjoyed this blog post and if you do like it, please do take a second to rate this post and also leave a few comments if you can. See you later!