After installing Win7 RC on my production Media Center box (ie the box I literally can't watch TV without) I noticed an annoying problem.
The skip forward & back buttons didn't work!
At first I feared the issue was an intended "by design" change. After all, I'm sure the commercial TV networks don't like us skipping over their ads. Fortunately, it seems it's a compatibility issue between Win7 RC and my IR receiver (in my case, Zalman HD160, IRTrans).
The solution:
Map the IR codes to the keyboard equivalents for the skip buttons via the following changes in %ProgramFiles%\IRTrans\remotes\apps.cfg.
[APP]MEDIACENTER
NEXT [KEY]\CTRLF
PREV [KEY]\CTRLB
A cool new feature in Windows7 & Server 2008 R2 is the ability to boot from a .vhd file!
Having your OS isolated to a .vhd file has a number of advantages including but not limited to:
- easier to backup,
- multi-boot scenarios that are more isolated from each other,
- ability to trial new builds on your physical hardware without trashing your existing installation.
Whilst there's a variety of ways that boot from .vhd can be achieved, here are the steps I've used. Note, I deliberately tried to avoid the installation GUI by doing part of the install via the WinPE command line. It's possible to automate the remainder of the installation process via an unattend.xml file. However, that's a potential topic for a future blog post.
1) Install the Windows AIK.
Windows® Automated Installation Kit (AIK) for Windows® 7 RC
http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=60a07e71-0acb-453a-8035-d30ead27ef72
2) Create bootable WinPE media (DVD or USB key) and include imagex.exe from the AIK on it.
Windows PE Walkthroughs
http://technet.microsoft.com/en-us/library/dd799278(WS.10).aspx
3) (optional) Include the install.wim from your desired Windows installation media on your WinPE media. Alternatively, I suppose you can always refer to the original installation media instead.
4) Boot off your WinPE media then run the following DISKPART commands (it's possible to script this via DISKPART /s). Be warned, clean wipes everything on disk 0.
select disk 0
clean
create partition primary size=100
format quick fs=ntfs label=System
assign letter=s
active
create partition primary
format quick fs=ntfs label=Data
assign letter=d
create vdisk file=d:\win7rc.vhd type=fixed maximum=10240
attach vdisk
select disk 1
create partition primary
format quick fs=ntfs label=OS
assign letter=c
exit
5) Extract the Windows image to your .vhd and copy the necessary boot files to your system partition.
imagex /apply w7rc_x86.wim 5 c:
bcdboot c:\windows /s s:
*** START OF UPDATE 7-Aug-2009 ***
You *might* also need the following. I don't recall doing this when I originally posted, however I needed it in a recent attempt.
bcdedit /store drive:\boot\bcd /set {guid} device vhd=[locate]\win7rc.vhd
bcdedit /store drive:\boot\bcd /set {guid} osdevice vhd=[locate]\win7rc.vhd
Where drive: is the system partition.
*** END OF UPDATE 7-Aug-2009 ***
exit
Your system will reboot and proceed with installation. For those that are keen to automate the remainder of the installation, here's a link to the relevant info:
Step-by-Step: Basic Windows Deployment for IT Professionals
http://technet.microsoft.com/en-us/library/dd349348(WS.10).aspx
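For those wanting to script step 4 via DISKPART /s, here's a minimal Python sketch that generates the script text from the commands listed above. The function name, its parameters and the idea of generating the file on another machine (WinPE won't have Python) are my own assumptions, not part of the original steps. The commands themselves are copied as-is from step 4.

```python
def build_diskpart_script(vhd_name="win7rc.vhd", vhd_size_mb=10240, system_size_mb=100):
    """Return DISKPART script text mirroring the commands in step 4.

    WARNING: the generated script runs 'clean' against disk 0, which
    erases every partition on that disk.
    """
    lines = [
        "select disk 0",
        "clean",
        # Small system partition that will hold the boot files.
        f"create partition primary size={system_size_mb}",
        "format quick fs=ntfs label=System",
        "assign letter=s",
        "active",
        # Remainder of the physical disk holds the .vhd file.
        "create partition primary",
        "format quick fs=ntfs label=Data",
        "assign letter=d",
        # Fixed-size virtual disk; size is in MB.
        f"create vdisk file=d:\\{vhd_name} type=fixed maximum={vhd_size_mb}",
        "attach vdisk",
        "select disk 1",
        "create partition primary",
        "format quick fs=ntfs label=OS",
        "assign letter=c",
        "exit",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the script so it can be redirected to a text file.
    print(build_diskpart_script(), end="")
```

Save the output to a text file on your WinPE media (say vhd.txt, a name I've made up) and run it from WinPE with DISKPART /s vhd.txt.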
I noticed SQL updates (KB960089 but also SQL SP3) were failing to apply on the WSS Developer VM.
After many hours of frustration, I located this great blog post that helped me to overcome the issue:
http://blogs.msdn.com/sqlserverfaq/archive/2009/01/30/part-1-sql-server-2005-patch-fails-to-install-with-an-error-unable-to-install-windows-installer-msp-file.aspx
In my case, the missing .msi's were from SQL Developer edition. The missing .msp's were from SQL SP2 & KB948109.
Here's the download location for SQL KB948109:
http://download.microsoft.com/download/7/c/7/7c7e394d-ddbf-4f2a-9e86-cf054e04931d/SQLServer2005-KB948109-x86-ENU.exe
Windows SharePoint Services 3.0 SP1 Developer Evaluation VPC Image
http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=1beeac6f-2ea1-4769-9948-74a74bd604fa
After installing all outstanding updates on the WSS Developer VM I started getting unexpected prompts for authentication.
First I tried disabling the loopback check as per KB896861. Unfortunately, no luck.
Seems the issue had to do with IE detecting the wrong zone. It was detecting the Internet zone whilst I'm clearly in the Intranet zone (ie URL was http://spvm).
Solution was to uncheck "Automatically detect intranet network" at:
Internet Options->Security tab->Local intranet->Sites button
Windows SharePoint Services 3.0 SP1 Developer Evaluation VPC Image
http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=1beeac6f-2ea1-4769-9948-74a74bd604fa
You receive error 401.1 when you browse a Web site that uses Integrated Authentication and is hosted on IIS 5.1 or IIS 6
http://support.microsoft.com/kb/896861
A web server is deemed to be unresponsive if it’s either not providing a response at all and/or it’s not achieving the response time (performance) expectations of the users.
In my “Troubleshooting 101” post, I mentioned that after the problem has been defined (ie basic facts collected), the next step in the troubleshooting process is to gather data relevant to diagnosing the issue. I’m planning to cover an introduction to analysing the data in a future post.
Following is a summary of my recommended action plan:
1) Ensure that the appropriate troubleshooting tools are available and/or optimally configured on the affected server(s)
2) Collect data capturing the current configuration of the server(s)
3) At the time of the problem, collect data necessary to capture the problem state
4) Provide the data gathered to an appropriate resource for analysis
Here are some details for the aforementioned action plan (steps 1 & 2 are for now whilst the remainder of the action plan is for when the problem next occurs):
1) Troubleshooting tools/configuration:
a. Configure a binary format Performance System Monitor (aka Perfmon) counter log for the following objects. Choose a sample interval and other settings that are appropriate for your environment (ie disk space availability, etc):
.NET*
Active Server Pages
ASP.NET*
LogicalDisk
Memory
Network Interface
Paging File
PhysicalDisk
Process
Processor
System
Thread (this object can be particularly helpful when troubleshooting high cpu however there’s a reasonable additional overhead in including it so you might like to omit it for the initial data gathering attempt)
Web Service
* meaning all objects beginning with prefix.
By default, the process id (PID) doesn’t appear in the Perfmon Process instance names. The PID can be helpful during analysis of the Perfmon log so make the following registry change before you start the counter logging. Simply add a DWORD named “ProcessNameFormat” and give it a value of 2 under:
HKLM\SYSTEM\CurrentControlSet\Services\PerfProc\Performance
281884 The Process object in Performance Monitor can display Process IDs (PIDs)
http://support.microsoft.com/kb/281884
b. Ensure that you have time-taken enabled in the IIS logging for the affected website(s). Note, it’s not enabled by default and can be very useful as an objective measure of responsiveness.
c. Ensure that you have the “Debugging Tools for Windows” available on the affected server(s). Note, if you prefer, you can simply copy the “Debugging Tools for Windows” folder to the affected server(s) rather than running the install:
Download “Debugging Tools for Windows”
http://www.microsoft.com/whdc/DevTools/Debugging/default.mspx
2) Current configuration:
a. Gather general configuration information from the affected web server(s) via MPSReports:
MPSReports (MPSRPT_SETUPPerf.EXE):
http://www.microsoft.com/downloads/details.aspx?FamilyID=cebf3c7c-7ca5-408f-88b7-f9c79b7306c0&displaylang=en
b. Gather a copy of the IIS Metabase (%windir%\system32\inetsrv\Metabase.xml)
3) Capture the problem state:
a. Whilst the server is next considered unresponsive, capture hang dump(s) via the following command:
cscript.exe adplus.vbs -hang -iis -quiet -o <output path>
ADPlus comes with the “Debugging Tools for Windows” mentioned above in 1c).
More info:
How to use ADPlus to troubleshoot "hangs" and "crashes"
http://support.microsoft.com/kb/286350
b. (optional) Repeat a). In some situations (eg high cpu), it can be helpful to capture a 2nd hang dump. The 2nd dump should only be initiated after the 1st dump has completed. Note, the 1st dump hasn’t completed when the .VBS finishes. The script launches CDB.EXE instances, so wait for them to exit before initiating the 2nd dump. The 2nd dump can be helpful for determining which specific threads are responsible for the cpu usage, etc.
4) Gather the following and provide to an appropriate resource for analysis:
a. Perfmon logs covering the period leading up to the problem,
b. IIS logs covering the period leading up to the problem,
i. IIS activity log (%windir%\system32\LogFiles\W3SVCx\*.log).
ii. HTTPERR log (%windir%\system32\LogFiles\HTTPERR\*.log).
c. Hang dump(s),
d. Event logs (both Application and System in .evt format). The event logs are included in the data gathered by MPSReports (step 2a) so you might find re-running MPSReports to be a convenient way to gather a copy of the event logs.
Troubleshooting is typically an iterative process. In other words, repeat steps 3-4 for each occurrence of the issue until resolution is achieved.
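Once time-taken is enabled (step 1b), the IIS activity logs can be scanned for slow requests. Here's a rough Python sketch, assuming the logs are in W3C extended format with a #Fields: header and that time-taken is recorded in milliseconds. The function name and the 5000ms threshold are my own illustrative choices, so adjust them to suit your environment.

```python
def slow_requests(log_lines, threshold_ms=5000):
    """Yield (date, time, uri, time_taken_ms) for requests at or above the threshold.

    Expects W3C extended log lines. The '#Fields:' directive tells us
    which column holds each value, so the column order doesn't matter.
    """
    fields = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The field list can change mid-file, so re-read it each time.
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#") or "time-taken" not in fields:
            # Other directives, or time-taken logging isn't enabled.
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # malformed or truncated entry
        row = dict(zip(fields, values))
        try:
            elapsed = int(row["time-taken"])
        except ValueError:
            continue
        if elapsed >= threshold_ms:
            yield (row.get("date", "-"), row.get("time", "-"),
                   row.get("cs-uri-stem", "-"), elapsed)


if __name__ == "__main__":
    import sys
    # Usage: python slow_requests.py <path to an IIS .log file>
    with open(sys.argv[1], "r", errors="replace") as handle:
        for entry in slow_requests(handle):
            print(entry)
```

This is only a quick first pass over the data. It doesn't replace the Perfmon logs and hang dumps, which are what you'll need to figure out why the requests were slow.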
It’s been far too long between blog posts so here’s a post and a promise to blog more frequently...
I thought I’d share some of my thoughts on this topic that’s been the focus of my career for the past 7.5 years. I’ve tried to keep this as generic as possible.
Troubleshooting is somewhat of an art and a science. Fortunately, with a logical approach it can be more like a science and less like a modern art disaster! :) Far too often I encounter situations where the chaos associated with a problem has clouded the judgement of those tasked with addressing it. From my experience, a structured, logical approach will achieve results faster than any other approach founded in matching the chaos of those affected by the problem. Talking about how important it is to resolve the problem, etc does nothing to address it. Only action will achieve the outcome desired by all involved.
At the risk of stating the obvious, a high-level overview of my troubleshooting approach is as follows:
1) Define the problem
2) Gather data
3) Analyse data
4) Implement potential solutions
5) Repeat 1)-4) until the problem is resolved
1) Define the problem
The initial and arguably most important step to troubleshooting is to define the problem that you’re hoping to overcome. After all, without a clear understanding of what you’re hoping to overcome/achieve you’ve got little hope. Some examples of the questions you should be asking include:
- What symptom(s) indicate that the problem occurred or is occurring?
- Are there multiple symptoms that can be attributed to the problem either at the time of the problem or even in the timeframe leading up to the problem?
- How do you know that the problem has occurred or is occurring?
- When did the problem first occur?
- When did the problem last occur?
- Approximately how frequently is the problem occurring?
- What action(s) are you taking to recover from the problem state when it occurs?
- Can you reproduce the issue at will? If so, what are the steps necessary to do so?
Note, the above is not a definitive list. However, I hope it’s enough to give you an idea as to the type of questioning that should be occurring before proceeding further down the troubleshooting path.
2) Gather data
Gather data that helps you to understand the problem. For example, the configuration of the affected environment, events leading up to the problem and the environment state at the time of the problem.
3) Analyse data
Invest time into thoroughly analysing the data that has been gathered. Leverage automated analysis tools where possible. Your goal should be to extract clues from the data that might help you to figure out the cause of the issue. Search whatever resources you have available to you (eg Internet) in an attempt to locate others that have experienced the same/similar situations. You won’t always find others who’ve encountered exactly the same issue. However, you’re likely to find others who’ve faced something similar and you’re likely to learn from their journey.
4) Implement potential solutions
Potential solutions should be justified by observations from the data analysis and/or experience in the problem domain in general. It’s often necessary to actively promote the changes you believe are likely solutions, as those in control are sometimes reluctant to risk any change to the affected environment. The reality is a change of some sort is likely to be necessary to resolve the issue so don’t be shy in regard to pushing the changes you feel are most likely to achieve the objective.
5) Repeat 1)-4) until the problem is resolved
Troubleshooting is often an iterative process. Don’t expect to “nail it” on your first attempt. You’ll often need to refine the action plan in response to the observations made during data analysis.
A problem is sometimes considered “resolved” if it is agreed that relief has been achieved. In other words, determining absolute root cause and/or fully addressing or understanding the reasons why the remedy has been successful is sometimes a luxury. Engineering types typically aren’t satisfied with an outcome unless the problem and its solution are fully understood. However, you’ll sometimes need to accept that your goal has been achieved when the problem is considered resolved by others.
It's often helpful to instrument your code to help with troubleshooting, etc. Instrumentation is really just a fancy word for tracing.
Here's an example of tracing to a file from ASP.NET.
<%@ Page Language="C#" CompilerOptions="/d:TRACE" %>
<script runat="server">
void Page_Load(object sender, EventArgs e)
{
System.Diagnostics.Trace.WriteLine(String.Format("{0},{1}", DateTime.Now, "Hello world!"));
}
</script>
<configuration>
<system.diagnostics>
<trace autoflush="true">
<listeners>
<add name="mytrace" type="System.Diagnostics.TextWriterTraceListener" initializeData="c:\temp\mytrace.csv" />
</listeners>
</trace>
</system.diagnostics>
</configuration>
If you'd like to avoid the CompilerOptions Page directive, an alternate technique is to add the following additional web.config entries. Note, this is equivalent to adding the directive to every page in your ASP.NET application.
<configuration>
<system.codedom>
<compilers>
<compiler language="c#;cs;csharp"
extension=".cs"
compilerOptions="/d:TRACE"
type="Microsoft.CSharp.CSharpCodeProvider, System, Version=2.0.3500.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" warningLevel="1" />
</compilers>
</system.codedom>
</configuration>
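Once the trace file is being written, you'll want to read it back. Here's a small Python sketch that splits each line of c:\temp\mytrace.csv into its timestamp and message, assuming the "{0},{1}" format used in the Page_Load example above. The function name is made up for illustration. Note that DateTime.Now is formatted per the server's culture, so I'm assuming the timestamp itself contains no comma. If yours does, this split will be wrong.

```python
def parse_trace_lines(lines):
    """Split 'timestamp,message' trace lines into (timestamp, message) tuples.

    Only the first comma is treated as the separator, so commas inside
    the message are preserved.
    """
    entries = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        timestamp, separator, message = line.partition(",")
        if separator:
            entries.append((timestamp, message))
        else:
            # No comma at all: keep the whole line as the message.
            entries.append(("", line))
    return entries


if __name__ == "__main__":
    import sys
    # Usage: python parse_trace.py c:\temp\mytrace.csv
    with open(sys.argv[1], "r") as handle:
        for timestamp, message in parse_trace_lines(handle):
            print(timestamp, "|", message)
```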
These tips are reasonably well-known and have been blogged by others. However, considering how often I come across these common “mistakes”, I felt yet another blog post was worthwhile:
1) Disable ASP.NET debugging in production!
I cannot emphasize this enough: set debug=”false” in all your web.config files. I’m regularly pleasantly surprised by how many production issues can be resolved by simply disabling ASP.NET debugging.
815157 HOW TO: Disable Debugging for ASP.NET Applications
http://support.microsoft.com/kb/815157
More info on the “evils” of having ASP.NET debugging enabled in production environments is discussed in the following blog post by one of my colleagues:
http://blogs.msdn.com/tess/archive/2006/04/13/asp-net-memory-if-your-application-is-in-production-then-why-is-debug-true.aspx
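If you have many sites to check, a quick scan for web.config files with debugging enabled can save some time. Here's a minimal Python sketch, under the assumption that debugging is controlled by the debug attribute on the compilation element. The function names are my own and this only inspects web.config files, so it won't spot debug settings made elsewhere (eg a Page directive).

```python
import os
import xml.etree.ElementTree as ET


def debug_enabled(web_config_xml):
    """Return True if any <compilation> element has debug="true".

    Accepts the file contents as str or bytes. Namespaces are ignored
    because some web.config files declare one on <configuration>.
    """
    root = ET.fromstring(web_config_xml)
    for node in root.iter():
        tag = node.tag.rsplit("}", 1)[-1]  # drop any '{namespace}' prefix
        if tag == "compilation":
            if node.get("debug", "false").strip().lower() == "true":
                return True
    return False


def find_debug_configs(root_dir):
    """Walk root_dir and list web.config files that have debugging enabled."""
    hits = []
    for folder, _dirs, files in os.walk(root_dir):
        for name in files:
            if name.lower() != "web.config":
                continue
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                try:
                    if debug_enabled(handle.read()):
                        hits.append(path)
                except ET.ParseError:
                    pass  # not well-formed XML; skip it
    return hits


if __name__ == "__main__":
    import sys
    # Usage: python find_debug.py <folder containing your web sites>
    for path in find_debug_configs(sys.argv[1]):
        print(path)
```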
2) Store session state out of process
ASP.NET applications can encounter OutOfMemoryException’s (OOM’s) and other undesirable symptoms when .NET is struggling to allocate virtual memory. Busy sites that store session state in process are likely to encounter OOM’s, particularly if their recycling settings aren’t aggressive enough. Too aggressive recycling settings can cause problems also. Storing session state out of process is one of the simplest things you can do to avoid OOM’s and other issues associated with depletion of available virtual memory. Unfortunately, our default session state store is in process so many don’t consider moving session state out of process until they have a production issue. Storing session state out of process does have some implications (ie Serializable objects) so it’s worthwhile making this change early in your development cycle so you avoid the inconvenience of having to implement this change under the pressure associated with a production outage.
307598 INFO: ASP.NET State Management Overview
http://support.microsoft.com/kb/307598
3) Windows Server 2003 (or later)
Windows Server 2003 has a number of advantages over Windows 2000 for hosting ASP.NET applications:
a) You can isolate each web application/app domain to its own IIS6 app pool/worker process so they aren’t sharing the same per-process 2GB virtual address space limitation.
b) You can leverage the various worker process recycling options available in 2003. Appropriate pre-emptive recycling is a great way to avoid production outages.
c) x64 version allows 4GB of virtual address space even in 32-bit mode. Obviously, running out of virtual address space is much less likely if you have the luxury of running in 64-bit mode.
d) /3gb switch works even in the standard version.
4) .NET 2.0 (or later)
.NET 2.0 incorporates a significant number of improvements over .NET 1.x and most .NET 1.x sites will run under .NET 2.0 without modification. Note, .NET 2.0 isn’t installed by default on Windows Server 2003 so you’ll need to install it manually. If nothing else, the highly recommended threading configuration settings documented in KB 821268 are defaults in .NET 2.0.
821268 Contention, poor performance, and deadlocks when you make Web service requests from ASP.NET applications
http://support.microsoft.com/kb/821268
5) Increase the allowed number of outbound HTTP connections from the default (2).
By default, in order to be a compliant HTTP 1.1 client, .NET only allows 2 outbound HTTP connections per process/host. Whilst this might be appropriate if you’re building a Winforms app, it can be a bottleneck for server-side applications like ASP.NET websites that are doing web service calls, etc. You can increase the allowed number of outbound HTTP connections via the maxconnection .config setting. KB 821268 recommends 12 * N where N is the number of logical CPUs in your ASP.NET server.
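To illustrate the 12 * N arithmetic from tip 5, here's a small Python sketch that works out the value and prints a connectionManagement fragment. The function name is hypothetical, and the address="*" wildcard (apply to all hosts) is my choice for the example. Refer to KB 821268 for where the setting belongs and for the other threading settings it recommends alongside it.

```python
import os


def maxconnection_fragment(logical_cpus=None):
    """Return a .config fragment with maxconnection set to 12 * N.

    logical_cpus defaults to the CPU count of the machine running this
    script, so pass it explicitly if that isn't the ASP.NET server.
    """
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1
    limit = 12 * logical_cpus
    return (
        "<configuration>\n"
        "  <system.net>\n"
        "    <connectionManagement>\n"
        f'      <add address="*" maxconnection="{limit}" />\n'
        "    </connectionManagement>\n"
        "  </system.net>\n"
        "</configuration>\n"
    )


if __name__ == "__main__":
    print(maxconnection_fragment(), end="")
```

For example, a server with 4 logical CPUs ends up with maxconnection="48".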
I've been working with a few customers lately that have been experiencing this WebException that has been confirmed as an issue introduced by connection management design changes incorporated into .NET 2.0. The exception typically occurs during a web service call however any scenario involving a HttpWebRequest with keep-alives could encounter this issue.
Note, this issue doesn't occur in .NET 1.x so if you are experiencing this exception in any .NET version prior to .NET 2.0 RTM then I suggest that you are probably dealing with a different issue so KB 915599 is probably a good place to start.
The issue involves how we handle the scenario whereby a server has sent a FIN to a keep-alive connection. We should close the connection and issue the next client request on a new connection. However, in .NET 2.0 RTM, we attempt to reuse the original connection resulting in the server resetting the connection on the next request. This reset results in this exception being raised in the client.
The issue is currently scheduled to be resolved in .NET 3.5. However, we are also considering the possibility of a hotfix for .NET 2.0.
If you suspect that you're hitting this issue, you’ll need to capture Netmon trace(s), ideally from both ends of the network conversation, whilst reproducing the issue. Generally speaking, it’s preferable to capture network traces from both ends of a conversation as this will allow you to determine if there is any possibility that intermediaries (ie proxy servers, networking hardware) are influencing the conversation.
If you’d like to minimise the size of the trace file, you can use a capture filter.
Resolutions D, E & F from KB 915599 are potential workarounds for you to consider. However, resolution D (ie disabling keep-alives) won’t be an option for you if you are using NTLM authentication as it requires keep-alives.
*** UPDATE *** The hotfix for .NET 2.0 is available via contacting Microsoft Customer Support Services and quoting KB 941633. http://support.microsoft.com/kb/941633
This is a handy tip to reduce the file size of your Netmon traces. This is particularly useful when you need to leave the trace running for an extended period of time. Thanks go to my colleague Andreja Rusjakovski for this tip...
Just before starting the trace go to Capture->Filter->Load button and select a *.cf file. An example of the contents of the .cf file for a Port 80 only trace is as follows. Note, to capture a different port substitute the "0050" for a different value (ie "0050" is hexadecimal for decimal 80).
[CAPTURE FILTER]
VERSION=2
[SAPS ETYPES]
SAPS=1
ETYPES=1
[ADDRESSES]
NLINES=0
[ANDEXP1]
PATTERN1=0, 22, 2,0050
PATTERN2=0, 24, 2,0050
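If you need a filter for another port, the hexadecimal conversion is easy to get wrong by hand. Here's a Python sketch that converts a port number to the 4-digit hex form and fills in the template above. The function names are my own, and I'm assuming Netmon accepts upper-case hex digits (the port 80 example has no letters in it, so I can't tell from that alone). The offsets 22 and 24 look to be hexadecimal too (34 and 36 decimal), which would line up with the TCP source and destination ports on an Ethernet frame with a 20 byte IP header, but treat that as my reading rather than gospel.

```python
def port_to_hex(port):
    """Return the port as 4 hex digits, eg 80 -> '0050'."""
    if not 0 < port <= 0xFFFF:
        raise ValueError("port must be between 1 and 65535")
    return format(port, "04X")


def build_capture_filter(port):
    """Return .cf file contents based on the port 80 example, for any port."""
    hex_port = port_to_hex(port)
    lines = [
        "[CAPTURE FILTER]",
        "VERSION=2",
        "[SAPS ETYPES]",
        "SAPS=1",
        "ETYPES=1",
        "[ADDRESSES]",
        "NLINES=0",
        "[ANDEXP1]",
        # One pattern per port field: same offsets as the original example.
        f"PATTERN1=0, 22, 2,{hex_port}",
        f"PATTERN2=0, 24, 2,{hex_port}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a filter for port 443; redirect the output to a .cf file.
    print(build_capture_filter(443), end="")
```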
1) Download and install Netmon from the following URL. Note, the password for the .zip is "trace".
ftp://ftp.microsoft.com/pss/tools/netmon/netmon2.zip
2) Start Netmon.
Administrative Tools->Network Analysis Tools->Network Monitor
3) Select the appropriate network interface.
The first time you run Netmon, you'll be asked to select the network interface to trace. The following command from the command line should help you to identify the appropriate interface via the "Physical Address":
ipconfig /all
4) Increase the buffer settings.
By default, Netmon will only trace up to 1MB of data before it starts to overwrite the capture buffer. Set the buffer to a larger size (say 10MB) via Capture->Buffer Settings menu item.
5) Start the trace via Capture->Start menu item.
6) Reproduce the issue.
7) Stop the trace via the Capture->Stop menu item.
8) Save the trace via the File->Save As menu item.
9) To complement the trace, capture network configuration information to a .txt file:
ipconfig /all > %computername%-ipconfig.txt
10) .ZIP up the .cap and .txt files and send to your Microsoft support representative for analysis.
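If you find yourself gathering traces regularly, step 10 can be scripted. Here's a minimal Python sketch that bundles the .cap and .txt files from a folder into a single .zip. The function name and the archive name are made up for illustration, and it assumes you've saved the trace and the ipconfig output into the same folder.

```python
import zipfile
from pathlib import Path


def bundle_trace_files(folder, archive_name="netmon-trace.zip"):
    """Zip every .cap and .txt file in folder and return the archive path."""
    folder = Path(folder)
    archive = folder / archive_name
    members = sorted(
        path for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in (".cap", ".txt")
    )
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as bundle:
        for path in members:
            # Store just the file name, not the full folder path.
            bundle.write(path, arcname=path.name)
    return archive


if __name__ == "__main__":
    import sys
    # Usage: python bundle_trace.py <folder holding the .cap and .txt files>
    print(bundle_trace_files(sys.argv[1]))
```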