SharePoint Performance Blog Series Part 1

Where do I start with this performance issue?  HELP!

I’m going to write several posts in a SharePoint Performance series to give our customers some insight into how to approach and troubleshoot these challenging issues.  This first one covers the data Microsoft looks for when you call in with a SharePoint performance problem.
Performance issues can be extremely difficult, time consuming, frustrating, and expensive to troubleshoot and resolve.  Many times it takes an entire IT support team as well as several Microsoft engineers to accomplish two things.


1.)    Identify the source of the performance pain point

2.)    Provide relief as quickly as possible 


Section 1:

What is the specific issue?

This is the first question that a Microsoft Support Engineer should ask.  The following are bad examples of customers reporting a performance problem:

A.)    My SharePoint servers are extremely slow

B.)     My clients can’t connect to SharePoint

As you can see, these reports raise a lot of follow-up questions, and narrowing them down to a properly scoped issue can be time consuming.

 

Here are two examples of customers calling in to properly report a performance issue:

A.)   Pages take in excess of two minutes to render for all users accessing SharePoint site collections\sites under the Web Application named “Sharepoint Content-WebApp”.

B.)    Clients are unable to connect to SharePoint sites between 2:00 PM and 3:00 PM.  Clients can connect to all SharePoint sites outside of this time frame.  All Web Front-ends appear to be affected.

 

Section 2:

What data can I collect prior to calling Microsoft about a SharePoint performance problem?

Several customers have asked me the above question.  The reason is that several hours are often spent understanding the issue, collecting data, and uploading data before Microsoft can even start looking into the problem.  By gathering the proper data before calling Microsoft, several hours might be saved.  It’s important to understand that, depending on the type of performance issue encountered, Microsoft may request additional data.

 

Data to collect in no specific order:

1.)    Run SPS Report on every Web Front-end (WFE)

 Download: http://www.codeplex.com/spsreport/Release/ProjectReleases.aspx?ReleaseId=5706

 

2.)     Collect a performance log during the time of the problem on a minimum of one Web Front-end.  A performance capture before, during, and after the problem is preferred.  I mention a minimum of one Web Front-end because it’s easier to grab data off of one server than off of many to start.  The question is, which Web Front-end do I collect data from?  For example, SharePoint pages are loading slowly on a client, so a hosts file entry has been configured on that client to send it directly to a specific Web Front-end named WFE1, bypassing the NLB.  If pages are still loading slowly, you know that WFE1 is a good place to start collecting performance logs.

Important!  Collect a baseline performance log now, while server performance is healthy, for at least a 48-hour span that covers your highest user load on each Web Front-end.

Note:   Please see part 2 in this series (Coming Soon) for instructions on how to configure a Performance Monitor log on SharePoint servers.

 

3.)    Collect ULS logs from every Web Front-end during the time of the problem.  This step assumes that the ULS logs are in a non-default directory.  If your ULS logs are in the default directory, you can skip step 3 because SPS Report will grab them.

You can find the directory path by performing the following steps:

A.)   Launch Central Administration

B.)    Select Operations Tab

C.)    Select Diagnostic Logging

D.)   Trace log location will be near the bottom of the screen

Note:   We recommend leaving logging at its default settings.  Do not enable verbose logging unless directed by Microsoft.
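
If the trace log directory holds many files, a small script can narrow them down to the ones that overlap the problem window.  The following is only a rough Python sketch, not an official tool.  It assumes the ULS file names end in a start time of the form -YYYYMMDD-HHMM.log and that each file covers the period until the next file begins.  The function names, the D:\ULSLogs path, and the dates are made up for illustration, so substitute your own trace log location and problem times.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumption: ULS file names look like <SERVER>-YYYYMMDD-HHMM.log,
# where the timestamp is the time the file was started.
ULS_NAME = re.compile(r"-(\d{8}-\d{4})\.log$", re.IGNORECASE)


def uls_start_time(name):
    """Return the start time encoded in a ULS file name, or None."""
    match = ULS_NAME.search(name)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d-%H%M")


def select_uls_logs(names, window_start, window_end):
    """Pick the log files that can contain entries from the problem window."""
    stamped = sorted(
        (uls_start_time(n), n) for n in names if uls_start_time(n) is not None
    )
    selected = []
    for i, (start, name) in enumerate(stamped):
        # A file covers the time from its own start until the next file starts.
        next_start = stamped[i + 1][0] if i + 1 < len(stamped) else datetime.max
        if start <= window_end and next_start > window_start:
            selected.append(name)
    return selected


def find_uls_logs(log_dir, window_start, window_end):
    """Apply select_uls_logs to the *.log files in one directory."""
    names = [p.name for p in Path(log_dir).glob("*.log")]
    return select_uls_logs(names, window_start, window_end)


# Hypothetical usage (path and times are placeholders):
# find_uls_logs(r"D:\ULSLogs",
#               datetime(2009, 3, 1, 14, 0),
#               datetime(2009, 3, 1, 15, 0))
```

Run it once per Web Front-end, since each server writes its own set of ULS logs.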

 

4.)    Synchronous unfiltered network traces of the problem. 

I prefer that customers use Netmon 3.2 to capture, although it's not required.  In the earlier example, pages are rendering extremely slowly for a client.  To gather synchronous traces, we’ll need a network trace on the client and on a specific Web Front-end, both running at the same time while the client attempts to launch and render the page.  Again, one way to control which Web Front-end a client hits is to configure a hosts file on the client.  This assumes that an NLB exists, which is usually the case.

 

5.)    Collect IISAPP output before, during, and after the performance problem.

IISAPP is a command line utility that dumps out the application pools and their corresponding process IDs (PIDs).   It’s extremely important to gather this because it’s one way we’re able to map an application pool to its PID within the performance log.   Also, gathering more than one capture tells us whether the PIDs are changing, which indicates that a w3wp process is being recycled, either through manual intervention or automatically via application pool recycling.

 

Steps to run IISAPP:

a.)    Click Start, Run and type CMD
b.)    Type IISAPP

If you are prompted, proceed to step c.  If you are not prompted, proceed to step f.

c.)    You might receive a popup which says

“This script does not work with Wscript” Click OK


d.)    You should get a second prompt to register cscript which says:

“Would you like to register cscript as your default host for VBScript?”

e.)    You’ll get a prompt confirming that Cscript was successfully registered  (Click OK)
f.)     Run IISAPP again and you should see the output to verify it works
g.)    To collect output for Microsoft, run the command and output to file.

(For example:  C:\>iisapp > c:\sharepointdata\iisappoutput.txt)
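
Once you have two or more IISAPP output files, comparing them by hand gets tedious.  Here is a minimal Python sketch that reports which application pools changed PID between two captures.  It assumes each output line looks like "W3WP.exe PID: 1234   AppPoolId: SomePool" and that each pool runs a single worker process.  The function names and the sample pool names are hypothetical.

```python
import re

# Assumption: each iisapp line looks like
#   W3WP.exe PID: 1234   AppPoolId: SomePool
IISAPP_LINE = re.compile(r"PID:\s*(\d+)\s+AppPoolId:\s*(.+?)\s*$", re.IGNORECASE)


def parse_iisapp(text):
    """Map application pool name -> PID from one iisapp capture."""
    pools = {}
    for line in text.splitlines():
        match = IISAPP_LINE.search(line)
        if match:
            pools[match.group(2)] = int(match.group(1))
    return pools


def changed_pids(before_text, after_text):
    """Return {pool: (old_pid, new_pid)} for pools whose PID differs."""
    before = parse_iisapp(before_text)
    after = parse_iisapp(after_text)
    return {
        pool: (before[pool], after[pool])
        for pool in before
        if pool in after and before[pool] != after[pool]
    }


# Hypothetical usage with two files captured as in step g:
# with open(r"c:\sharepointdata\iisapp_before.txt") as f1, \
#         open(r"c:\sharepointdata\iisapp_after.txt") as f2:
#     print(changed_pids(f1.read(), f2.read()))
```

A pool that shows up in the result was recycled or restarted somewhere between the two captures, which is worth noting alongside the times you report.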

 

Additional Request:

Report the exact times you had or have the problem and provide some details around what led you to capture the above data set. 

I like the data organized before you call in.  For example, compress the performance logs into one zip file and the ULS logs into another.
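
If you would rather script the packaging, something along these lines would do it.  This is just a sketch using Python's standard zipfile module.  The zip_folder name and the folder paths in the usage comment are placeholders I made up, so point them at wherever you saved each data set.

```python
import zipfile
from pathlib import Path


def zip_folder(source_dir, zip_path):
    """Compress every file under source_dir into zip_path; return the file count."""
    source = Path(source_dir)
    count = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to the source folder inside the archive.
                archive.write(path, str(path.relative_to(source)))
                count += 1
    return count


# Hypothetical usage, one archive per data type (paths are placeholders).
# Write the zip outside the folder being compressed:
# zip_folder(r"c:\sharepointdata\perflogs", r"c:\upload\perflogs.zip")
# zip_folder(r"c:\sharepointdata\uls", r"c:\upload\ulslogs.zip")
```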