
interfacing SharePoint with Akamai

 

 

 

What is Akamai?

Think of Akamai as a very intelligent, caching, reverse proxy server, which sits in front of your SharePoint farm. External user requests go to Akamai servers first. If Akamai has the content cached, it serves the content directly back to the client; otherwise Akamai requests the content from your SharePoint farm and then sends it to the client. Akamai does intelligent caching; that is, it can cache static portions of a page like images and stylesheets, while still getting the dynamic parts of the page from your SharePoint server. In any case, the client never directly hits your SharePoint farm.

Akamai is much more than just a reverse proxy server. Akamai has many farms of servers (at last report, more than 20,000 servers) distributed in 70 countries. Akamai has a dynamic mapping system, which uses heuristics and network performance historical data, to route the client request to the optimal Akamai farm.

Your SharePoint farm is referred to as the “origin farm” in Akamai literature. The basic content flow is as follows:

  1. The user’s browser makes a request, which is routed to the optimal Akamai farm.
  2. The Akamai edge server checks its cache and requests any non-cached content from the origin.
  3. The origin (SharePoint farm) generates the non-cached or dynamic content and responds with it.
  4. Akamai combines the origin’s response with its locally cached content.
  5. The client browser renders Akamai’s response.

Akamai is the only external user your origin farm will see. Because of this, BLOB caching and output caching are usually of limited benefit when using Akamai.

Working with Akamai

You contract with Akamai for its services. Akamai will assign a representative who will assist you in setting up your origin farm to communicate with Akamai’s edge servers, and in configuring the Akamai edge server to meet your application’s needs. Akamai provides training and support as needed.

You should plan on several weeks to get everything working.

The high level steps are:

  1. If using SSL, install an SSL certificate on your origin farm, and provide Akamai the certificate information so Akamai can request and install SSL certificates on the Akamai edge servers.
  2. Create a DNS record for your origin farm. The naming convention is your origin URL prefixed with “origin-”; for example, “origin-www.litwareinc.com”. Akamai will take over your URL; that is, www.litwareinc.com. Akamai refers to your URL as a digital property. Your contract with Akamai can include multiple digital properties; for example, if you are exposing multiple SharePoint web applications through Akamai.
  3. Configure the Akamai edge server behavior. The Akamai representative will provide you credentials to access the configuration console. Some of the more important configuration parameters are:
     a. Whether to use SSL.
     b. The origin farm URL and port number if non-standard.
     c. Cache key format, and whether to ignore case when comparing cache keys.
     d. Compression to use when communicating with the origin farm.
     e. Time to Live (TTL) rules, which determine what types of content Akamai should cache, and for how long.
     f. Prefetch rules.
  4. Test the Akamai configuration. Akamai has extensive reports to help you identify and tune page performance by adjusting the caching rules and other edge server settings.
  5. Go live. You can access the Akamai portal to periodically review performance reports, so you can adjust edge server configuration as needed.

SharePoint Publishing – Cache Invalidation

Akamai is most beneficial in content publishing scenarios; that is, where most users are anonymous and the content is mostly static. This scenario makes the caching provided by Akamai most effective. This translates to a SharePoint publishing portal. Since edge server cache TTL should be as long as possible, there needs to be a way to notify the edge server when a new page version has been published; thereby forcing the Akamai cached version of the page to be refreshed. Akamai provides the Content Control Utility (CCU) web service for this purpose.

The CCU is a SOAP web service that allows you to refresh specific cached objects, or to remove specific objects. The CCU provides the option of using invalidation-based or removal-based refreshing. Requests are propagated through the Akamai network, and most removals are completed within 10 minutes of the request. One limit is that files submitted for CCU requests should contain no more than about 100 URLs per SOAP request. Your Akamai representative will give you the end-point URL for this web service, and the user name and password required for authentication to the web service.
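To stay under the limit of roughly 100 URLs per SOAP request, the URLs collected from SharePoint can be split into batches before any call is made. The following is a minimal Python sketch of just that batching step; it does not call the CCU service, and the function name, the constant, and the sample URLs are my own illustrations rather than anything from Akamai's documentation.

```python
from typing import Iterator, List

# Approximate limit quoted in this post; confirm the real value with Akamai.
MAX_URLS_PER_REQUEST = 100


def batch_urls(urls: List[str],
               batch_size: int = MAX_URLS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield de-duplicated URLs, in first-seen order, in groups of batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    unique = list(dict.fromkeys(urls))  # drop repeats, keep original order
    for start in range(0, len(unique), batch_size):
        yield unique[start:start + batch_size]


if __name__ == "__main__":
    # Hypothetical list of pages changed by a content deployment job.
    changed = [f"http://www.litwareinc.com/Pages/page{i}.aspx" for i in range(250)]
    for batch in batch_urls(changed):
        # Each batch would become the URL list of one CCU SOAP request.
        print(len(batch))
```

De-duplicating first matters because a single content deployment job can raise several events for the same page.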

A typical SharePoint configuration is to have an Authoring farm where content pages are created and edited. Content deployment jobs periodically push new page versions to the Public farm, which is the origin farm. To automate the CCU web service calls, you can install event handlers in the Public farm. The event handlers call the CCU web service for new, changed, and deleted pages. The architecture might look like the following diagram.

[Diagram: akamai1]

Event receivers are installed on all libraries and lists which contain content that can be changed by authors. This includes modifications by Content Deployment jobs, or manual copying/deleting of assets in the Public farm. The asynchronous (after) events are used to minimize performance impacts. The event handler is installed in the Global Assembly Cache (GAC). This allows it to be called from any site in the Public farm. The event handler packages the URLs and then sends them to Akamai’s CCU web service.

These events are captured by the event handlers:

  • ItemDeleted
  • ItemUpdated

By default these libraries and lists in the public portal root site collection, and every subsite, should have event receivers installed. Additional lists and libraries may be added as needed.

  • Pages Library
  • Site Collection Image Library
  • Site Collection Documents Library
  • Style Library
  • Images Library
  • Reusable Content List

A SOAP request is formatted according to Akamai’s Content_Control_Interfaces.pdf document. The Akamai user name and password are stored in SharePoint’s single sign-on repository for protection. SharePoint object model calls are used to retrieve the user name and password to make the web service call. The invalidate action is specified to minimize impact if the cache does not truly need to be updated. The response code is checked to determine if a retry is necessary.

It is most effective to use asynchronous methods to call the CCU method, to prevent excessive blocking if the CCU web service is slow in responding. The async call might look like this picture.

[Image: akamai2]

The async response handler would look like this picture:

[Image: akamai3]

Conclusion

Akamai can provide a huge performance boost for public portals serving global audiences. Most of the effort to set up Akamai is transparent to SharePoint. The exception is the need to forcefully update the Akamai cache when new content is published. Cache refreshing can be forced by calling the CCU web service provided by Akamai.

Posted by jimmiet | 0 Comments

IIS 6.0 Compression

Introduction

Overview – What Problem Does IIS Compression Solve?

Total response time is composed of 3 major components. This can be expressed as a formula:

User response time = server processing time + network transmission time + client rendering time

The network transmission time can be a major component for remote users accessing SharePoint over WAN links. Reducing the number of bytes transmitted can reduce the network time. IIS Compression can accomplish this reduction in the number of bytes transmitted.

IIS compression is highly configurable. Compression can be scoped to:

  1. Static, dynamic, or both static and dynamic content.
  2. Entire web sites, specific directories, or even individual files.
  3. Specific file types based upon file extension.

The compression level is a number between 0 and 10, where 10 is the greatest compression. More CPU resources are required for higher compression numbers.

When SharePoint is installed, IIS is configured to compress both static and dynamic files.

By default, static compression applies to the file types HTM, HTML, and TXT at level 10.

By default, dynamic compression applies to the file types ASP and EXE at level 0.

Static versus Dynamic Compression

Compressed static responses are cached to disk. Once the compressed page is cached, there is no further CPU overhead until the cache expires. Static compression can have dramatic effects; for example: core.js is 257 KB on disk, but IIS static compression reduces it to 54 KB.

Dynamic compression requires trial-and-error testing to find the optimal settings. Dynamic compression can affect CPU resources because IIS does not cache compressed dynamic output. If compression is enabled for dynamic responses and IIS receives a request for a file that contains dynamic content, the response that IIS sends is compressed every time it is requested.

Because dynamic compression consumes considerable CPU time and memory resources, use it only on servers that have underutilized CPUs. If the web site generates a large volume of dynamic content, consider whether the additional processing cost of HTTP compression can be reasonably afforded. If the % Processor Time counter is already 80 percent or higher, enabling HTTP compression is not recommended.

To evaluate how much of your processor is typically being used, follow these steps:

  • Establish a baseline for your processor usage by using System Monitor to log the following counters over several days. If you use Performance Logs and Alerts, you can log the data to a database and then query the data, examining the results in detail.
    • Processor\% Processor Time. This counter has a total instance and a separate instance for each processor in the system. If your server has more than one processor, you need to watch the individual processors as well as the total to discover any imbalance in the workload.
    • Network Interface\Bytes Sent/sec. Counters for the Network Interface performance object display the rate at which bytes are transmitted over a TCP/IP connection by monitoring the counters on the network adapter.
  • Enable compression, and continue to log the values for these counters for an extended period — preferably for several days — so you have a good basis for comparison. Collect a broad sample to determine how compression affects various aspects of performance. Conduct the following tests:
  • Try variations of compression settings.
    • Enable static compression only, dynamic compression only, and both.
    • Change the list of files that you use for compression testing for both static and dynamic content.
    • Vary the compression level.

Use the performance logs to determine the sweet spot at which Network Interface Bytes is reduced the most while % Processor Time remains below 80%.
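Picking the sweet spot from the logged counters can be reduced to a simple rule: discard every trial whose average CPU is at or above the ceiling, then take the trial with the fewest bytes sent. Here is a minimal Python sketch of that rule. The Trial class, the function name, and the sample numbers are made up for illustration; real values would come from your own performance logs.

```python
from dataclasses import dataclass
from typing import List, Optional

CPU_CEILING = 80.0  # percent; the threshold used in this post


@dataclass
class Trial:
    label: str               # hypothetical description of the settings tested
    avg_cpu_percent: float   # average of Processor\% Processor Time
    avg_bytes_sent: float    # average of Network Interface\Bytes Sent/sec


def sweet_spot(trials: List[Trial],
               cpu_ceiling: float = CPU_CEILING) -> Optional[Trial]:
    """Return the trial with the fewest bytes sent among those under the CPU ceiling."""
    eligible = [t for t in trials if t.avg_cpu_percent < cpu_ceiling]
    if not eligible:
        return None  # every configuration was too expensive
    return min(eligible, key=lambda t: t.avg_bytes_sent)


if __name__ == "__main__":
    # Made-up sample data, not measurements.
    trials = [
        Trial("no compression", 35.0, 900_000),
        Trial("static 10, dynamic 4", 55.0, 300_000),
        Trial("static 10, dynamic 9", 85.0, 260_000),
    ]
    best = sweet_spot(trials)
    print(best.label if best else "no configuration under the CPU ceiling")
```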

Global, Web Site, Directory, File Scoping

Both static and dynamic compression can be configured at multiple scopes by scripting or by using the Metabase GUI tool.

The script C:\Inetpub\AdminScripts\adsutil.vbs is the recommended approach for configuration. This allows automated scripts to be applied to all web servers.

Gzip and Deflate

There are 2 types of compression: gzip and deflate. Gzip is essentially the deflate format wrapped in an extra header and checksum. Both should be configured the same (same types of files and same level of compression), so browsers using either compression method will get similar results.
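The relationship between the two formats is easy to see outside of IIS. The Python sketch below uses the standard zlib module (not IIS, and not gzip.dll) to compress the same bytes as a raw deflate stream and as gzip, then shows that the gzip output is the same deflate stream plus a small header and trailer. Note that zlib levels run from 0 to 9, which is not the same scale as the IIS 0 to 10 levels; the helper names and sample text are only for illustration.

```python
import zlib


def deflate_raw(data: bytes, level: int = 6) -> bytes:
    """Compress to a raw deflate stream (negative wbits means no wrapper)."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def gzip_wrap(data: bytes, level: int = 6) -> bytes:
    """Compress to gzip format (wbits of 16 + 15 selects the gzip wrapper)."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    return compressor.compress(data) + compressor.flush()


if __name__ == "__main__":
    sample = b"function init() { return document.getElementById('x'); }\n" * 200
    raw = deflate_raw(sample)
    gz = gzip_wrap(sample)
    print("original bytes:", len(sample))
    print("deflate bytes: ", len(raw))
    print("gzip bytes:    ", len(gz))
    # The gzip output is the deflate stream between a header and a trailer.
    print("same payload:  ", gz[10:-8] == raw)
```

Since the compressed payload is the same, the CPU cost and the size savings of the two schemes are nearly identical, which is why it makes sense to keep their settings in step.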

Recommended Initial Configuration

These are the recommended starting values for gzip and deflate compression. These values can be adjusted based upon performance counter captures, to optimize the network bytes transmitted while keeping the CPU load within reasonable bounds.

HcDoStaticCompression : (BOOLEAN) True
HcDoDynamicCompression : (BOOLEAN) True
HcOnDemandCompLevel : (INTEGER) 10
HcDynamicCompressionLevel : (INTEGER) 0
HcScriptFileExtensions : (LIST) (6 Items)
"asp"
"exe"
"axd"
"ascx"
"asmx"
"aspx"
HcFileExtensions : (LIST) (5 Items)
"htm"
"html"
"txt"
"css"
"js"

Configuring IIS 6.0

Open a command prompt.

Change directory to C:\Inetpub\AdminScripts

Directory of C:\Inetpub\AdminScripts
04/03/2007 05:46 PM <DIR> .
04/03/2007 05:46 PM <DIR> ..
11/13/2006 10:08 AM 85,813 adsutil.vbs
02/21/2003 08:48 PM 6,064 synciwam.vb
2 File(s) 91,877 bytes
2 Dir(s) 43,512,107,008 bytes free
C:\Inetpub\AdminScripts>

Global Configuration

Capture the current settings for a historical record.

>cscript adsutil.vbs enum W3Svc/Filters/Compression/Parameters
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
KeyType : (STRING) "IIsCompressionSchemes"
HcCompressionDirectory : (EXPANDSZ) "%windir%\IIS Temporary Compressed Files"
HcCacheControlHeader : (STRING) "max-age=86400"
HcExpiresHeader : (STRING) "Wed, 01 Jan 1997 12:00:00 GMT"
HcDoDynamicCompression : (BOOLEAN) False
HcDoStaticCompression : (BOOLEAN) True
HcDoOnDemandCompression : (BOOLEAN) True
HcDoDiskSpaceLimiting : (BOOLEAN) True
HcNoCompressionForHttp10 : (BOOLEAN) True
HcNoCompressionForProxies : (BOOLEAN) True
HcNoCompressionForRange : (BOOLEAN) False
HcSendCacheHeaders : (BOOLEAN) False
HcMaxDiskSpaceUsage : (INTEGER) 99614720
HcIoBufferSize : (INTEGER) 8192
HcCompressionBufferSize : (INTEGER) 8192
HcMaxQueueLength : (INTEGER) 1000
HcFilesDeletedPerDiskFree : (INTEGER) 256
HcMinFileSizeForComp : (INTEGER) 1
>cscript adsutil.vbs enum W3Svc/Filters/Compression/DEFLATE
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
KeyType : (STRING) "IIsCompressionScheme"
HcDoDynamicCompression : (BOOLEAN) True
HcDoStaticCompression : (BOOLEAN) True
HcDoOnDemandCompression : (BOOLEAN) True
HcCompressionDll : (EXPANDSZ) "%windir%\system32\inetsrv\gzip.dll"
HcFileExtensions : (LIST) (3 Items)
"htm"
"html"
"txt"
HcScriptFileExtensions : (LIST) (2 Items)
"asp"
"exe"
HcPriority : (INTEGER) 1
HcDynamicCompressionLevel : (INTEGER) 0
HcOnDemandCompLevel : (INTEGER) 10
HcCreateFlags : (INTEGER) 0
>cscript adsutil.vbs enum W3Svc/Filters/Compression/GZIP
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
KeyType : (STRING) "IIsCompressionScheme"
HcDoDynamicCompression : (BOOLEAN) True
HcDoStaticCompression : (BOOLEAN) True
HcDoOnDemandCompression : (BOOLEAN) True
HcCompressionDll : (EXPANDSZ) "%windir%\system32\inetsrv\gzip.dll"
HcFileExtensions : (LIST) (3 Items)
"htm"
"html"
"txt"
HcScriptFileExtensions : (LIST) (2 Items)
"asp"
"exe"
HcPriority : (INTEGER) 1
HcDynamicCompressionLevel : (INTEGER) 0
HcOnDemandCompLevel : (INTEGER) 10
HcCreateFlags : (INTEGER) 1

Enable a setting; for example, turn on global static compression if it is currently off:

>cscript adsutil.vbs set w3svc/filters/compression/parameters/HcDoStaticCompression true
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
 
HcDoStaticCompression : (BOOLEAN) True

Add CSS and JS file types to the static compressed file types. CSS (Cascading Stylesheets) and JS (JavaScript) will provide the most performance gains with SharePoint

>cscript adsutil.vbs SET W3SVC/Filters/Compression/Deflate/HcFileExtensions "htm" "html" "txt" "css" "js"
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
 
HcFileExtensions : (LIST) "htm" "html" "txt" "css" "js"
 
>cscript adsutil.vbs get W3SVC/Filters/Compression/Deflate/HcFileExtensions
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
 
HcFileExtensions : (LIST) (1 Items)
"htm" "html" "txt" "css" "js"
 
>cscript adsutil.vbs SET W3SVC/Filters/Compression/gzip/HcFileExtensions "htm" "html" "txt" "css" "js"
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
 
HcFileExtensions : (LIST) "htm" "html" "txt" "css" "js"
 
>cscript adsutil.vbs GET W3SVC/Filters/Compression/gzip/HcFileExtensions
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
 
HcFileExtensions : (LIST) (1 Items)
"htm" "html" "txt" "css" "js"

Add AXD, ASMX, & ASPX to dynamic file list. Tip: Don't compress JPG/JPEG images (already compressed).

>cscript adsutil.vbs SET W3Svc/Filters/Compression/DEFLATE/HcScriptFileExtensions "asp" "exe" "axd" "aspx" "asmx"
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
 
HcScriptFileExtensions : (LIST) "asp" "exe" "axd" "aspx" "asmx"
 
>cscript adsutil.vbs GET W3Svc/Filters/Compression/DEFLATE/HcScriptFileExtensions
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
 
HcScriptFileExtensions : (LIST) (5 Items)
"asp"
"exe"
"axd"
"asmx"
"aspx"
 
>cscript adsutil.vbs SET W3Svc/Filters/Compression/GZIP/HcScriptFileExtensions "asp" "exe" "axd" "aspx" "asmx"
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
 
HcScriptFileExtensions : (LIST) "asp" "exe" "axd" "aspx" "asmx"
 
>cscript adsutil.vbs GET W3Svc/Filters/Compression/GZIP/HcScriptFileExtensions
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
 
HcScriptFileExtensions : (LIST) (5 Items)
"asp"
"exe"
"axd"
"asmx"
"aspx"

Consider testing the impact of varying compression levels in a laboratory environment while closely monitoring CPU utilization and the potential impact on the Web servers. Typically a compression level between 7 and 9 provides the best balance of performance and CPU load in most circumstances. Tip: Start with the dynamic compression level set to 4, and then try increasing it to see the CPU impact.

>cscript adsutil.vbs SET W3Svc/Filters/Compression/DEFLATE/HcDynamicCompressionLevel 4
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
 
HcDynamicCompressionLevel : (INTEGER) 4
 
>cscript adsutil.vbs GET W3Svc/Filters/Compression/DEFLATE/HcDynamicCompressionLevel
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
 
HcDynamicCompressionLevel : (INTEGER) 4
 
>cscript adsutil.vbs SET W3Svc/Filters/Compression/GZIP/HcDynamicCompressionLevel 4
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
 
HcDynamicCompressionLevel : (INTEGER) 4
 
>cscript adsutil.vbs GET W3Svc/Filters/Compression/GZIP/HcDynamicCompressionLevel
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
 
HcDynamicCompressionLevel : (INTEGER) 4
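
Because DEFLATE and GZIP should be configured identically, it is easy to mistype one of the paired commands. One way to avoid that is to generate both command lines from a single set of values. This is a minimal Python sketch that only builds the command strings in the same form used in this post; it does not run them, and the function and variable names are hypothetical.

```python
from typing import Dict, List

SCHEMES = ("DEFLATE", "GZIP")            # the two compression schemes
BASE_PATH = "W3Svc/Filters/Compression"  # metabase path used in this post


def adsutil_commands(list_settings: Dict[str, List[str]],
                     int_settings: Dict[str, int]) -> List[str]:
    """Build matching adsutil.vbs SET commands for every scheme."""
    commands = []
    for scheme in SCHEMES:
        for name, values in list_settings.items():
            # Each list item is passed as its own quoted argument.
            quoted = " ".join(f'"{value}"' for value in values)
            commands.append(
                f"cscript adsutil.vbs SET {BASE_PATH}/{scheme}/{name} {quoted}")
        for name, value in int_settings.items():
            commands.append(
                f"cscript adsutil.vbs SET {BASE_PATH}/{scheme}/{name} {value}")
    return commands


if __name__ == "__main__":
    commands = adsutil_commands(
        {
            "HcFileExtensions": ["htm", "html", "txt", "css", "js"],
            "HcScriptFileExtensions": ["asp", "exe", "axd", "aspx", "asmx"],
        },
        {"HcDynamicCompressionLevel": 4},
    )
    print("\n".join(commands))
```

The output can be pasted into a batch file and applied to every web server in the farm.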

Web Site Scope Configuration

In some cases you may wish to enable or disable compression at only the site or site element level as opposed to global level. Use the path to the web site in the adsutil command line. You can determine the site path using the IIS Metabase Explorer application available in the IIS 6.0 Resource Kit, or by enumerating sites using the adsutil.vbs script.


>cscript adsutil.vbs enum w3svc
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
 
KeyType : (STRING) "IIsWebService"
MaxConnections : (INTEGER) 4294967295
AnonymousUserName : (STRING) "IUSR_EPGOPSR2BASE"
AnonymousUserPass : (STRING) "**********"
ConnectionTimeout : (INTEGER) 120
AllowKeepAlive : (BOOLEAN) True
DefaultDoc : (STRING) "Default.htm,Default.asp,index.htm"
HttpCustomHeaders : (LIST) (1 Items)
 
"X-Powered-By: ASP.NET"
 
(many lines removed here)
 
[/w3svc/1513483211]
[/w3svc/1669737538]
[/w3svc/1720207907]
[/w3svc/2004785039]
[/w3svc/809964160]
[/w3svc/941433650]
[/w3svc/AppPools]
[/w3svc/Filters]
[/w3svc/Info]

You can match the w3svc web site numbers to the web site name using the IIS Manager MMC. In the following picture we see the collaboration web site is web site ID 809964160.

If we assume the collaboration web site is only accessed by users on the local LAN where network bandwidth is not an issue, we can disable dynamic compression for just this web site by using the web site metabase path (w3svc/809964160/root/) to set the DoDynamicCompression parameter.

>cscript adsutil.vbs SET w3svc/809964160/root/DoDynamicCompression false
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.
 
DoDynamicCompression : (BOOLEAN) False

Restart IIS

After completing the configuration changes, always restart IIS. At a command prompt, either enter

IISReset

Or

NET STOP W3SVC
NET START W3SVC

Example Improvement with Compression

With static compression set to 10, and dynamic compression set to 4, Fiddler captured these statistics for a publishing site home page. The round trip cost and elapsed time values are estimates based upon typical network latency.

Request Count:     43
Bytes Sent:     24,716
Bytes Received: 218,719
 
ACTUAL PERFORMANCE
--------------
Requests started at:    08:37:02:5221
Responses completed at:    08:37:05:0157
Total Sequence time:    00:00:02.4935856
 
RESPONSE CODES
--------------
HTTP/401:     8
HTTP/200:     35
 
RESPONSE BYTES (by Content-Type)
--------------
image/gif:    30,479
text/css:    17,003
~headers:    18,432
image/jpeg:    13,634
text/html:    28,335
application/x-javascript:    110,836
 
ESTIMATED WORLDWIDE PERFORMANCE
--------------
The following are VERY rough estimates of download times when hitting servers based in WA, USA.
 
 
US West Coast (Modem - 6KB/sec)
---------------
Round trip cost: 4.30s
Elapsed Time:     44.30s
 
Japan / Northern Europe (Modem)
---------------
Round trip cost: 6.45s
Elapsed Time:     46.45s
 
China (Modem)
---------------
Round trip cost: 19.35s
Elapsed Time:     59.35s
 
US West Coast (DSL - 30KB/sec)
---------------
Round trip cost: 4.30s
Elapsed Time:     12.30s
 
Japan / Northern Europe (DSL)
---------------
Round trip cost: 6.45s
Elapsed Time:     14.45s
 
China (DSL)
---------------
Round trip cost: 19.35s
Elapsed Time:     27.35s
 

These results are for the same page with static and dynamic compression disabled.

Request Count:     39
Bytes Sent:     21,626
Bytes Received: 772,669
 
ACTUAL PERFORMANCE
--------------
Requests started at:    08:41:59:8797
Responses completed at:    08:42:25:8069
Total Sequence time:    00:00:25.9272816
 
RESPONSE CODES
--------------
HTTP/401:     2
HTTP/200:     37
 
RESPONSE BYTES (by Content-Type)
--------------
image/gif:    30,565
text/css:    103,482
~headers:    15,093
image/jpeg:    13,634
text/html:    72,976
application/x-javascript:    536,919
 
ESTIMATED WORLDWIDE PERFORMANCE
--------------
The following are VERY rough estimates of download times when hitting servers based in WA, USA.
 
US West Coast (Modem - 6KB/sec)
---------------
Round trip cost: 3.90s
Elapsed Time:     135.90s
 
Japan / Northern Europe (Modem)
---------------
Round trip cost: 5.85s
Elapsed Time:     137.85s
 
China (Modem)
---------------
Round trip cost: 17.55s
Elapsed Time:     149.55s
 
US West Coast (DSL - 30KB/sec)
---------------
Round trip cost: 3.90s
Elapsed Time:     29.90s
 
Japan / Northern Europe (DSL)
---------------
Round trip cost: 5.85s
Elapsed Time:     31.85s
 
China (DSL)
---------------
Round trip cost: 17.55s
Elapsed Time:     43.55s
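
The estimated times in both captures appear to follow a simple model: a fixed latency per request, plus the total bytes transferred divided by the link speed, rounded down to whole seconds. The Python sketch below reproduces the published figures under that assumption. The per-request latencies (0.10, 0.15, and 0.45 seconds) and the link speeds (6,000 and 30,000 bytes per second) are values I inferred from the numbers above, not documented Fiddler settings.

```python
from typing import Tuple

# Assumed per-request latency in seconds, inferred from the figures above.
LATENCY_SECONDS = {
    "US West Coast": 0.10,
    "Japan / Northern Europe": 0.15,
    "China": 0.45,
}
# Assumed link speeds in bytes per second ("6KB/sec" and "30KB/sec").
BYTES_PER_SECOND = {"Modem": 6000, "DSL": 30000}


def estimate(request_count: int, bytes_sent: int, bytes_received: int,
             region: str, link: str) -> Tuple[float, float]:
    """Return (round trip cost, elapsed time) in seconds."""
    round_trip = request_count * LATENCY_SECONDS[region]
    transfer = (bytes_sent + bytes_received) // BYTES_PER_SECOND[link]
    return round(round_trip, 2), round(round_trip + transfer, 2)


if __name__ == "__main__":
    captures = {
        "with compression": (43, 24_716, 218_719),
        "without compression": (39, 21_626, 772_669),
    }
    for name, (requests, sent, received) in captures.items():
        for region in LATENCY_SECONDS:
            for link in BYTES_PER_SECOND:
                cost, elapsed = estimate(requests, sent, received, region, link)
                print(f"{name}, {region} ({link}): "
                      f"round trip {cost:.2f}s, elapsed {elapsed:.2f}s")
```

Under this model the round trip cost depends only on the number of requests, which is why it is slightly higher in the compressed capture (43 requests versus 39) even though far fewer bytes were transferred.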

Note the following differences:

Description                                  With Compression   Without Compression
Bytes Received                               218,719            772,669
Response bytes: image/gif                    30,479             30,565
Response bytes: text/css                     17,003             103,482
Response bytes: ~headers                     18,432             15,093
Response bytes: image/jpeg                   13,634             13,634
Response bytes: text/html                    28,335             72,976
Response bytes: application/x-javascript     110,836            536,919
Japan / Northern Europe (DSL) round trip     6.45s              5.85s
Japan / Northern Europe (DSL) elapsed time   14.45s             31.85s
China (DSL) round trip cost                  19.35s             17.55s
China (DSL) elapsed time                     27.35s             43.55s
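
To put the differences in perspective, the byte counts from the two Fiddler captures can be turned into percentage reductions. This short Python sketch uses only the numbers reported above; the dictionary and function names are mine.

```python
# Response bytes by content type, copied from the two Fiddler captures above.
WITH_COMPRESSION = {
    "image/gif": 30_479,
    "text/css": 17_003,
    "~headers": 18_432,
    "image/jpeg": 13_634,
    "text/html": 28_335,
    "application/x-javascript": 110_836,
}
WITHOUT_COMPRESSION = {
    "image/gif": 30_565,
    "text/css": 103_482,
    "~headers": 15_093,
    "image/jpeg": 13_634,
    "text/html": 72_976,
    "application/x-javascript": 536_919,
}


def percent_reduction(before: int, after: int) -> float:
    """Percentage by which 'after' is smaller than 'before' (negative if larger)."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    for content_type, before in WITHOUT_COMPRESSION.items():
        after = WITH_COMPRESSION[content_type]
        print(f"{content_type}: {percent_reduction(before, after):.1f}%")
    total_before = sum(WITHOUT_COMPRESSION.values())
    total_after = sum(WITH_COMPRESSION.values())
    print(f"total: {percent_reduction(total_before, total_after):.1f}%")
```

The text types (CSS, JavaScript, HTML) account for essentially all of the savings, while the image types barely change because they are already compressed.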

References

 

IIS 6.0 Resource Guide
http://www.microsoft.com/downloads/details.aspx?FamilyID=80a1b6e6-829e-49b7-8c02-333d9c148e69&DisplayLang=en

Using Granular Compression in IIS 6.0 Webcast
http://msevents.microsoft.com/CUI/WebCastEventDetails.aspx?EventID=1032270805&EventCategory=3&culture=en-US&CountryCode=US

Metabase Property Reference (IIS 6.0)
http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/c63788cc-70b4-4a44-a9a3-329fa8fb3afb.mspx?mfr=true

Using HTTP Compression for Faster Downloads (IIS 6.0)
http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/25d2170b-09c0-45fd-8da4-898cf9a7d568.mspx?mfr=true

HTTP Compression, Internet Information Services 6.0, and SharePoint Products and Technologies
http://blogs.technet.com/wbaer/archive/2008/01/30/http-compression-internet-information-services-6-0-and-sharepoint-products-and-technologies.aspx

Posted by jimmiet | 0 Comments

ACL Limitations

 The Problem

Before talking about SharePoint, it is necessary to talk about the Windows operating system. Security authorization is based on Access Control Lists, or ACLs. An ACL is a list of access control entries (ACEs). Each ACE identifies a security principal, such as a user or AD group, and the access rights allowed, denied, or audited for that security principal. The original designers of the Windows OS set a maximum size of 64K bytes for ACLs, which at the time seemed much more than would be needed.

Operating System ACLs

An obvious question is how many users or AD groups can an ACL contain? The answer is, “It depends”. Since an ACL is composed of ACEs, the question becomes “How many ACEs can an ACL contain”? An ACE is a variable length structure. The variable part is the user or group’s security ID, or SID. The security identifier (SID) structure is a variable-length structure used to uniquely identify users or groups. The variability is driven by the domain topology in which the SID was created and assigned. A deeply nested forest domain structure will result in longer SIDs than a flat domain structure for example. Given this variability, the commonly heard numbers of ACEs in an ACL range from 1,000 to 2,000.
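
The 1,000 to 2,000 range can be sanity-checked with some back-of-the-envelope arithmetic. The Python sketch below assumes the usual layout of a simple access-allowed ACE: an 8-byte ACL header, then for each ACE a 4-byte header, a 4-byte access mask, and a SID made of 8 fixed bytes plus 4 bytes per sub-authority. Those sizes are my assumptions about the structure layout and should be checked against the Windows documentation; the function names are hypothetical.

```python
ACL_MAX_BYTES = 64 * 1024     # the 64K limit described in this post
ACL_HEADER_BYTES = 8          # assumed fixed ACL header size
ACE_FIXED_BYTES = 8           # assumed: 4-byte ACE header + 4-byte access mask
SID_FIXED_BYTES = 8           # assumed: revision, count, 6-byte authority
BYTES_PER_SUB_AUTHORITY = 4   # assumed size of each sub-authority value
MAX_SUB_AUTHORITIES = 15


def sid_bytes(sub_authorities: int) -> int:
    """Size of a SID with the given number of sub-authorities."""
    if not 1 <= sub_authorities <= MAX_SUB_AUTHORITIES:
        raise ValueError("sub_authorities must be between 1 and 15")
    return SID_FIXED_BYTES + BYTES_PER_SUB_AUTHORITY * sub_authorities


def max_aces(sub_authorities: int) -> int:
    """How many same-sized ACEs fit in one ACL under these assumptions."""
    ace_bytes = ACE_FIXED_BYTES + sid_bytes(sub_authorities)
    return (ACL_MAX_BYTES - ACL_HEADER_BYTES) // ace_bytes


if __name__ == "__main__":
    for count in (5, 10, 15):
        print(f"{count} sub-authorities: about {max_aces(count)} ACEs")
```

With 5 sub-authorities per SID, which is what a typical domain account SID of the form S-1-5-21-x-y-z-RID has, the estimate is a little over 1,800 ACEs. Longer SIDs bring the count down toward the lower end of the range.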

SharePoint ACLs

Up to this point, we have been talking about operating system ACLs. These types of ACLs can impact search crawling (discussed later), but not authorization within SharePoint itself.

SharePoint has its own type of ACL that it uses for authorization within SharePoint. SharePoint uses its ACLs to make access decisions as well as UI trimming decisions. SharePoint ACEs are optimized for SharePoint’s use cases; hence the format is simpler than operating system ACEs. In SharePoint, every user or AD group explicitly added to a site collection gets an entry in the site collection’s UserInfo table, is assigned a principal ID (PID), and gets an entry in the Perms table.

The Perms table contains SharePoint’s ACLs, which are composed of SharePoint ACEs. A SharePoint ACE consists of the PID and an 8 byte permissions mask. So if inheritance is NOT broken, every user or NT group given access to a site collection has one row in the UserInfo table and one row in the Perms table. If a user who receives access indirectly through an NT group accesses the site, a row is added to the UserInfo table to contain that user’s display name, email address, and, depending upon the operations that user performs on the site, a copy of their operating system SID. No new row is added to the Perms table, since the original row for the user’s group is sufficient to define the SharePoint permissions for that user.

The Crawl ACL Translation Problem

The SharePoint search crawler keeps a copy of an object’s ACL in the search database to enable security trimming at search query time. To keep things simple and efficient for query processing, the search crawler translates SharePoint ACLs into operating system ACLs when building the index. Herein lies a problem. If the translated SharePoint ACL generates an operating system ACL larger than 64K, the search crawl throws an error, “parameter is incorrect”. If this error occurs at the site collection level, none of that site collection’s content will be indexed. This results in a situation in which the site collection itself works fine, but none of the site collection content can be searched.

How Can ACLs > 64K Happen?

One way is to explicitly add enough individual users or AD groups to a site collection’s membership so that the mapping of the SharePoint ACL to operating system ACL during search crawling results in an operating system ACL greater than 64K. The first section of this post says this will happen between 1,000 and 2,000 unique users or AD groups, depending upon the size of each user or group’s SID. While unlikely, this is a possibility.

The other, more likely way, is through broken permission inheritance within a site collection.

What’s Wrong with Broken Inheritance?

Let’s walk through a common scenario. A site collection is created for a large community of users; for example, a sales and marketing group with 5,000 members. Using best practice, we create a Sales and Marketing AD group, add the 5,000 users either directly or by nesting already existing AD groups. We then grant this AD group access to the site collection. The end result is one row in the UserInfo table and one row in the Perms table. Both the SharePoint ACL and mapped operating system ACL are less than 100 bytes long. Life is good; search indexing is happy and efficient.

We then start loading sales and marketing documents into an array of subsites and document libraries. It becomes apparent one subsite is for marketing research. We only want 25 people to have access. What do we do? We break inheritance with the site collection and give those 25 people explicit access to the marketing research subsite. We now have at least 25 rows in the UserInfo table, but more importantly, 25 additional rows in the Perms table.

Let’s say this scenario repeats itself for subsites divided by product line and country/region; since it is important to keep product strategy restricted. Maybe we break inheritance for 10 other subsites of 25 users each.  The net result is an additional 250 rows in the Perms table.

Now comes the killer scenario. We add another AD group to the site collection membership which includes all full time employees (FTEs), containing a total of 30,000 members, either directly or through nested AD group membership. This group has read access to all sales and marketing content except the previously mentioned subsites with broken inheritance. This action by itself only adds one more row to the Perms table, which is not a problem; however, the plot thickens…

It is determined that certain FTEs in manufacturing and other support organizations need contributor access to specific documents scattered throughout the sales and marketing site collection. One-by-one, inheritance is broken on specific documents, and individual FTE users are granted contributor access to just those documents.  Each time this happens, rows are added to the Perms table. Over time, perhaps 5 unique FTE users are granted broken inheritance access to 300 different individual documents, adding 1,500 rows to the Perms table. Broken inheritance has become a growing cancer.

Eventually the Perms table reaches 1,800 rows. The Perms row for the site collection itself now contains 1,800+ ACEs. The incremental crawl that night starts throwing “parameter is incorrect” errors while crawling the site collection. Why? Because the operating system ACL the search crawler is trying to generate from the 1,800+ ACEs in the site collection Perms table exceeds the 64K ACL max size.
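
The row counts in this scenario are easy to tally. The Python sketch below simply adds up the numbers from the walkthrough above (the dictionary keys are my own labels) and compares the total with the commonly heard 1,000 to 2,000 ACE range mentioned earlier.

```python
# Perms table rows added at each step of the walkthrough above.
PERMS_ROWS = {
    "Sales and Marketing AD group": 1,
    "marketing research subsite, 25 users": 25,
    "10 more subsites x 25 users": 10 * 25,
    "all-FTE AD group": 1,
    "300 documents x 5 FTE users": 300 * 5,
}

# Commonly heard range of ACEs that fit in one operating system ACL.
ACE_RANGE = (1_000, 2_000)


def total_rows(rows_by_step: dict) -> int:
    """Total Perms rows accumulated across all steps."""
    return sum(rows_by_step.values())


if __name__ == "__main__":
    running = 0
    for step, rows in PERMS_ROWS.items():
        running += rows
        print(f"{step}: +{rows} rows, running total {running}")
    low, high = ACE_RANGE
    in_danger_zone = low <= total_rows(PERMS_ROWS) <= high
    print("inside the 1,000 to 2,000 ACE range:", in_danger_zone)
```

The exact total is 1,777 rows, which is the "1,800" in round numbers. Only two of those rows come from AD groups; everything else comes from broken inheritance.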

What Can You Do To Protect Yourself?

Plan ahead when designing the information architecture (IA). Besides the number of AD groups and users explicitly granted membership in your site collections, consider what would happen if you need to start breaking inheritance for granular access control at the list, library, or individual document level. Several customers have already hit this problem.

I know of one major manufacturer whose Extranet supply chain site is based on one site collection. As more and more broken inheritance was applied to individual documents to keep each supplier’s information private to that supplier, the site collection became unsearchable.

The only after-the-fact solution is painful refactoring of the site collection structure to more granularly designed site collections, whose usage is carefully aligned with the intended user communities.

 

Posted by jimmiet | 0 Comments

Office Server Web Service Authorization


I recently encountered a mystery. I was testing a newly written utility that called the object model method Content.ContentSources to get a list of the search content sources. The utility functioned perfectly on my single server virtual development farm. When it came time to deploy to a medium farm, the utility was installed on the Central Administration server, which was separate from the Index server. The utility stopped working! (This is why you should always test in a multi-server environment.) Now I had to figure out what was causing the failure.

The root cause was Office Server Web Service security. When the utility ran on a single server, SharePoint automatically made direct object model calls (known as short-circuiting web service calls, to optimize performance), so the SearchAdmin.asmx web service was not called. When the utility ran on a separate server, SharePoint under-the-covers converted the object model call to a web service call to the Index server.
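The short-circuit behavior can be pictured with a small conceptual sketch. This is Python, not SharePoint code, and it is not how SharePoint implements the decision internally; the function name, the server name "CA01", and the two callables are made up for illustration.

```python
def call_search_admin(local_server, index_server, direct_call, web_service_call):
    """Choose the call path as the text describes: direct when local, remote otherwise."""
    if local_server.lower() == index_server.lower():
        return direct_call()       # short-circuit: same server, no web service hop
    return web_service_call()      # different server: goes through SearchAdmin.asmx


direct = lambda: "direct object model call"
remote = lambda: "web service call to SearchAdmin.asmx"

print(call_search_admin("MOSS", "moss", direct, remote))  # direct object model call
print(call_search_admin("CA01", "MOSS", direct, remote))  # web service call to SearchAdmin.asmx
```

The same utility code takes a different path depending on where it runs, which is why the single server test never exercised the web service security checks.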

Problem 1: SSL Failure

The first problem was quickly discovered in the Application Event Log. There was an SSL failure reported. The Office Server Web Service by default uses SSL to secure intra-farm communications. A search of KB articles found KB962928 (http://support.microsoft.com/?id=962928). This article matched my scenario. The .NET Framework 3.5 SP1 had just been installed on the farm. This installation can corrupt the SSL certificate used by the Office Server Web Service. You can refer to the KB article for details. In summary the fix is to run SelfSSL.exe found in the IIS 6.0 Resource Kit. This must be done on all servers in the farm.

To run SelfSSL.exe you need to know the Office Server Web Service identifier, which you can find in the IIS Manager MMC. It is 1720207907, as seen in the following picture.

[Screenshot: IIS Manager showing the Office Server Web Services site identifier]

Run the command as explained in the KB article on every farm server, using the correct Identifier value. A sample command session follows:

 

C:\Program Files\IIS Resources\SelfSSL>net stop osearch
The Office SharePoint Server Search service is stopping.
The Office SharePoint Server Search service was stopped successfully.

C:\Program Files\IIS Resources\SelfSSL>selfssl /s:1720207907 /v:1000
Microsoft (R) SelfSSL Version 1.0
Copyright (C) 2003 Microsoft Corporation. All rights reserved.

Do you want to replace the SSL settings for site 1720207907 (Y/N)?y
The self signed certificate was successfully assigned to site 1720207907.

C:\Program Files\IIS Resources\SelfSSL>

 

Problem 2: Connection Refused

Unfortunately, the next attempt at testing the utility also failed. This time the Application Event Log on the Index server had an entry 1314, from ASP.NET. The web service call was not even getting to SharePoint. It was being refused by the ASP.NET handler in IIS. (I highly recommend reading Inside SharePoint Enterprise Project Management with SharePoint. This article goes into great detail on how SharePoint web service authentication/authorization works.)

[Screenshot: event 1314 in the Application Event Log]

There is a wealth of information in the event description.  Notice the message says “URL authorization failed”.  We can also see the requested URL was “/SharedServices1/Search/SearchAdmin.asmx”, and the requesting user identity was “LITWAREINC\Administrator”.

But wait, the requesting user was a farm administrator. How could a farm administrator be denied access to a SharePoint URL? This sounds like a web.config issue, not a SharePoint issue.

Event code: 4007
Event message: URL authorization failed for the request.
Event time: 4/11/2009 4:28:24 PM
Event time (UTC): 4/11/2009 9:28:24 PM
Event ID: 418e2b58d47e4e0e81c213f24f64d642
Event sequence: 2
Event occurrence: 1
Event detail code: 0

Application information:
    Application domain: /LM/W3SVC/1720207907/root/SharedServices1-1-128839589029265968
    Trust level: Full
    Application Virtual Path: /SharedServices1
    Application Path: C:\Program Files\Microsoft Office Servers\12.0\WebServices\Shared\
    Machine name: MOSS

Process information:
    Process ID: 4448
    Process name: w3wp.exe
    Account name: LITWAREINC\sspservice

Request information:
    Request URL: https://moss:56738/SharedServices1/Search/SearchAdmin.asmx
    Request path: /SharedServices1/Search/SearchAdmin.asmx
    User host address: 192.168.150.2
    User: LITWAREINC\Administrator
    Is authenticated: True
    Authentication Type: Negotiate
    Thread account name: LITWAREINC\sspservice

 

The web.config file on the Index server is listed below. Unimportant sections have been removed for brevity. The key lines are the authorizations.

The first authorization, for the root level, looks fine. WSS_ADMIN_WPG membership includes all farm administrators including the requesting account “LITWAREINC\Administrator”, so this is not the problem.


The second authorization for the specific location “SharedServices1” looks more interesting. This authorization only lists 2 accounts and no groups. Neither account is “LITWAREINC\Administrator”, which is why ASP.NET denied access to the web service call.

<?xml version="1.0" encoding="utf-8"?>
<configuration>
    <configSections>
            . . .
    </configSections>
    <system.web>
        <authorization>
            <allow roles=".\WSS_ADMIN_WPG" />
            <deny users="*" />
        </authorization>
        <webServices>
            . . .
        </webServices>
    </system.web>
    <location path="SharedServices1" inheritInChildApplications="true">
        <microsoft.office.server>
            . . .
        </microsoft.office.server>
        <system.web>
            <authorization>
                <allow users="litwareinc\SPAppPool,litwareinc\SPFarmAdmin" />
            </authorization>
        </system.web>
    </location>
</configuration>

So now the question is where do these account names come from? A little digging through documentation and blogs revealed the answer. These are the application pool accounts for SharePoint web application sites. Every time you create a new application pool through Central Administration when creating or extending a web application, the application pool identity is added to this list. In this farm there are 2 application pool identities, SPAppPool (for non-administrative application pools) and SPFarmAdmin (for Central Admin and SSP application pools). It is important to notice that since litwareinc\administrator is not used as an application pool identity, this account does not appear in the authorized users list.
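When diagnosing this kind of failure it can help to read the allow list out of web.config programmatically rather than by eye. The following is a minimal diagnostic sketch in Python, assuming a trimmed copy of the file shown earlier. It only lists the accounts named in the location's allow element; it does not reproduce ASP.NET's full authorization rule evaluation, and the helper name allowed_users is my own.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the web.config listed earlier (elided sections left out)
WEB_CONFIG = r"""<configuration>
  <system.web>
    <authorization>
      <allow roles=".\WSS_ADMIN_WPG" />
      <deny users="*" />
    </authorization>
  </system.web>
  <location path="SharedServices1" inheritInChildApplications="true">
    <system.web>
      <authorization>
        <allow users="litwareinc\SPAppPool,litwareinc\SPFarmAdmin" />
      </authorization>
    </system.web>
  </location>
</configuration>"""


def allowed_users(config_xml, location_path):
    """Return the lower-cased accounts in <allow users="..."> for one <location>."""
    root = ET.fromstring(config_xml)
    users = set()
    for location in root.findall("location"):
        if location.get("path") != location_path:
            continue
        for allow in location.iter("allow"):
            for name in allow.get("users", "").split(","):
                if name.strip():
                    users.add(name.strip().lower())
    return users


users = allowed_users(WEB_CONFIG, "SharedServices1")
print(sorted(users))
print("litwareinc\\administrator" in users)  # False
```

Account names are compared in lower case because Windows account names are not case sensitive, while the event log and web.config use different casing for the same domain.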

Now that we know what the problem is, how do we get litwareinc\administrator into the authorized list? We cannot directly edit web.config. Apart from supportability questions, SharePoint automatically rewrites this list every minute. Even if we manually changed web.config, SharePoint would remove our change.

Since the utility had to run as litwareinc\administrator, the only way to make this happen was to create a dummy web application through Central Administration, specifying a new application pool with the identity litwareinc\administrator. Since we are not really using this application pool, we can stop the dummy web application and the application pool to minimize the performance impact. After this, SharePoint added litwareinc\administrator to the web.config authorized list.

The utility now works perfectly from any server in the farm. The takeaway of this story: if you are writing code that has to call the Office Server Web Service, impersonate a SharePoint application pool account when making the web service call. Keep in mind that some object model method calls are converted under the covers to web service calls, depending upon which server in the farm your code is running on.

Content Deployment and CEWP Absolute URLs


The Content Editor Web Part (CEWP) has a Rich Text Editor. This allows non-technical authors to generate custom content using a web part. This is a great feature for team collaboration sites, but some customers also use the CEWP on publishing site pages. Leaving aside the discussion of web parts versus field controls in published pages, there are issues using the CEWP to create content on published pages.

[Screenshot: the CEWP Rich Text Editor]

The Problem

A Rich Text Editor sounds like a great feature; however, there is a problem with content deployment, described by Andrew Connell. (There is a related problem for sites that can be accessed through multiple AAMs as described by Maxime Bombardier.) The basic problem is the Rich Text Editor forces all URLs to be absolute. If you look at the HTML generated by the above HTML editor, you will see:

[Screenshot: generated HTML with an absolute URL]

As Andrew Connell points out:

If you have a link to http://staging.adventureworkstravel.com/pages/contactus.aspx in a CEWP on a page and then do content deployment to http://www.adventureworkstravel.com, the link will be pointing back to the staging site (which will... or should... be inaccessible).

The absolute URL is not fixed up automatically during content deployment, so the target page will still point to the original URL location, not a location within the target farm. This means the absolute URL must be corrected in the target farm itself. In the preceding example, we want the target farm HTML to be a relative URL that points to locations within the target farm:

[Screenshot: the same HTML with a relative URL]

Maxime Bombardier’s control adapter strategy can be leveraged. With slight modification, Maxime’s code can convert absolute URLs to relative URLs, so they effectively point to the appropriate location in the target farm. How do we convert an absolute URL to a relative URL? Looking at the above example, we need to strip out the host portion of the URL; that is, we need to remove “http://moss.litwareinc.com”.
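The stripping step itself is plain string replacement and can be tried outside SharePoint. The following is a minimal sketch in Python, not the control adapter; the function name strip_hosts and the sample page path are assumptions for illustration. It orders prefixes by length so that the most specific one is replaced first.

```python
def strip_hosts(html, hosts):
    """Replace each absolute host prefix in html with "/" to leave a relative URL."""
    # Terminate every prefix with "/" so "http://host" and "http://host/" behave alike
    prefixes = [h if h.endswith("/") else h + "/" for h in hosts if h]
    # Longest prefix first, so a more specific URL is replaced before a shorter one
    for prefix in sorted(prefixes, key=len, reverse=True):
        html = html.replace(prefix, "/")
    return html


html = '<a href="http://moss.litwareinc.com/Pages/contactus.aspx">Contact us</a>'
print(strip_hosts(html, ["http://moss.litwareinc.com"]))
# <a href="/Pages/contactus.aspx">Contact us</a>
```

Because the replacement is purely textual, any occurrence of the host name in the rendered output is rewritten, including occurrences in visible text rather than in href or src attributes.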

The Solution

The difference between the content deployment fix we need, and Maxime’s AAM fix, is what gets stripped. Maxime’s AAM fix strips the AAM host names of the current web application. The content deployment fix needs to strip the AAM host names of the authoring web application, which the target farm does not know. In other words, we need a way to tell the control adapter which host names to strip. The solution is to make the host names configurable. This solution uses the SharePoint Config Store solution developed by Chris O'Brien.


So, the primary difference between this control adapter and Maxime’s control adapter is the GetAlternativeUrls method. This method’s logic is changed to read AAMs from the Config Store list, rather than using the object model to get the AAMs from the current web application.

The Code

using COB.SharePoint.Utilities;
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Web;
using System.Web.UI;
using System.Web.UI.Adapters;
using Microsoft.SharePoint;
using Microsoft.SharePoint.Administration;

namespace Litware.SharePoint.WebPartPages.CewpControlAdapter
{
    public class ContentEditorWebPartAdapter : ControlAdapter
    {
        protected override void Render(System.Web.UI.HtmlTextWriter writer)
        {
            StringBuilder sb = new StringBuilder();

            // Allow the CEWP to render itself into a string that we provide
            HtmlTextWriter htw = new HtmlTextWriter(new StringWriter(sb));
            base.Render(htw);
            string output = sb.ToString();

            // Wrap the adapter rendering logic in a try-catch so any error
            // in the adapter won't prevent the CEWP from rendering.
            try
            {
                // Now we post-process the CEWP rendering to convert absolute URLs to relative URLs
                string[] alternativeUrls = GetAlternativeUrls();

                if (alternativeUrls != null)
                {
                    foreach (string replaceableUrl in alternativeUrls)
                    {
                        // Skip empty entries; String.Replace() throws on an empty search string
                        if (String.IsNullOrEmpty(replaceableUrl))
                            continue;

                        // Do a simple String.Replace() of the alternativeUrls to generate a relative url
                        string searchFor = replaceableUrl;
                        string replaceWith = "/";
                        output = output.Replace(searchFor, replaceWith);
                    }
                }
            }
            catch (Exception ex)
            {
                // log exception here
            }

            // Finally, write the rendering to the page
            writer.Write(output);
        }

        private string[] GetAlternativeUrls()
        {
            string[] alternativeUrls = null;

            try
            {
                alternativeUrls = (string[])HttpContext.Current.Cache["alternativeUrls"];

                if (alternativeUrls == null)
                {
                    string temp = String.Empty;

                    // Get the URLs to be replaced from the config store
                    temp = ConfigStore.GetValue("CEWP Adapter", "AAMs");

                    if (String.IsNullOrEmpty(temp))
                        throw new ArgumentNullException("AAMs config store parameter is null or empty");

                    // Split apart the config parameter using a semicolon separator value
                    char[] separators = { ';' };
                    alternativeUrls = temp.Split(separators);

                    // Validate the config value
                    if (alternativeUrls == null || alternativeUrls.Length == 0)
                        throw new ArgumentNullException(
                            "AAMs config store parameter is null or empty after split");

                    for (int i = 0; i < alternativeUrls.Length; i++)
                    {
                        // Ensure the URL is "/" terminated for consistent replacement behavior
                        string replaceableUrl = alternativeUrls[i];
                        if (!string.IsNullOrEmpty(replaceableUrl) && !replaceableUrl.EndsWith("/"))
                            alternativeUrls[i] += "/";
                    }

                    // Sort, and then reverse the array
                    // to put the longest; that is, the most specific
                    // URLs first in the list
                    Array.Sort(alternativeUrls);
                    Array.Reverse(alternativeUrls);

                    // Cache for 5 minutes to allow for a somewhat quick refresh
                    // if the configuration values are changed
                    HttpContext.Current.Cache.Add("alternativeUrls", alternativeUrls, null,
                         DateTime.Now.AddMinutes(5), System.Web.Caching.Cache.NoSlidingExpiration,
                         System.Web.Caching.CacheItemPriority.Normal, null);
                }
            }
            catch (Exception ex)
            {
                // log exception here
            }

            return alternativeUrls;
        }
    }
}
 

Build the output assembly as strongly named, and then deploy it to the GAC on every WFE. A simple solution package (wsp file) can be created to automate the deployment to all WFEs. Creating a solution package is automatic if you create your project using the Visual Studio 2008 extensions for Windows SharePoint Services 3.0, v1.2.

Browser File

The control adapter is associated with the CEWP through browser file entries. The default browser file, compat.browser, is in the App_Browsers folder of each web application’s virtual directory. We don’t want to update this file; instead, we will create a separate file for our control adapter.

Additional browser files can be added to the same directory; however, additional files will not be recognized by ASP.NET until compat.browser is recompiled. Recompilation is forced by opening compat.browser in a text editor like Notepad, making an innocuous change (e.g., add and then delete a space), and then saving the file.

The customized browser file, litware.browser, is created and placed in the App_Browsers folder of every web application where the control adapter is to be enabled. Without this browser file, the control adapter is not called even though it may be installed in the GAC. The default compat.browser is then updated and saved to force a recompile of the local application browser files as described in the preceding paragraph.


In the litware.browser code, the controlType attribute tells ASP.NET to associate this control adapter with the CEWP. The adapterType attribute names the control adapter type and its assembly. The refID="Default" attribute tells ASP.NET to apply the adapter to all browser types.

<?xml version="1.0" encoding="utf-8" ?>
<browsers>
  <browser refID="Default">
    <controlAdapters>
      <adapter controlType="Microsoft.SharePoint.WebPartPages.ContentEditorWebPart"
               adapterType="Litware.SharePoint.WebPartPages.CewpControlAdapter.ContentEditorWebPartAdapter,
                    Litware.SharePoint.WebPartPages.CewpControlAdapter, Version=1.0.0.0, Culture=neutral,
                    PublicKeyToken=daf6fd1bbe0cfc20" />
    </controlAdapters>
  </browser>
</browsers>

Note: the App_Browsers directory must be updated on every WFE in the target farm.

References

Browser Definition File Schema (browsers Element)
Securing Browser Definition Files


Why Bring Down the Entire Farm for Patches?


Many customers are surprised to learn they have to bring down their SharePoint farm just to apply a hotfix or service pack, and express frustration when told there is no other option. Why is it necessary to stop the farm?

Think of a SharePoint farm as a living organism. It has a nervous system and a heart. The farm isn't viable if either of these components is interrupted.

The nervous system is a set of web services. All the servers in the farm are constantly chatting amongst themselves. What is the topic of the chatting? It is the farm health. The servers are constantly asking each other, "are you there and healthy?", or "do your registry settings match the configuration database, or has some human manually tried to adjust you?", or "can you respond to a search query?" These web services implement a self-healing, self-growing capability. For example, when you add a new web front end (WFE) to the farm, you don't have to manually add your custom features and solutions. The farm's nervous system recognizes that a new server has joined the farm organism, and automatically configures that WFE so it can function correctly in the farm community. If any server fails to respond appropriately to the web service nervous system conversations, because its patch level differs from that of the other servers in the farm, the farm organism could be thrown into disarray.

The heart of the farm is the configuration database, supported by the SSP databases and content databases. These databases maintain the farm state (configuration, customizations, pages, user profiles, content, etc.). The schema of these databases is tightly coupled to the configuration and content. Many hotfixes and patches must modify the database schema(s) to provide corrections and improvements. Because the databases are the heart of the farm, schema changes (open heart surgery) can only be performed by putting the patient to sleep when the changes are applied. Unfortunately, if the schema changes have to modify tables in the content databases, the result may be a long surgical procedure while thousands or even millions of rows of content data are updated.

Agreed, putting a farm to sleep for surgery is not ideal, but it is necessary.


Office Web Service


The Office Services Web service is used by Office SharePoint Server 2007 as a communication channel between Web servers and application servers. This service uses the following ports:

  • TCP         56737
  • TCP/SSL     56738
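If a firewall sits between farm servers, a quick reachability check on these two ports can save time. The following is a minimal sketch in Python using only the standard socket module; the helper name port_open is my own, and 127.0.0.1 is a placeholder to be replaced with the application server's name. A successful TCP connection only shows that the port is reachable, not that the web service authorizes the caller.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Placeholder host: replace 127.0.0.1 with the application server to test
for port in (56737, 56738):
    print(port, port_open("127.0.0.1", port))
```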

Access to the web service methods is restricted to the farm administrator group, WSS_ADMIN_WPG. None of the web service methods can be called from user code.

Web Services

Depending on features installed, the Office Server Web Services Web application exposes the following internal Web services, which are not available for calls from custom code:

Friendly Name | Location | Description
Search Web Service | SearchAdmin.asmx | Microsoft Office SharePoint Server 2007 Search Administration Web Service.
Search Application Web Service | /SSP/Search/SearchAdmin.asmx | Microsoft Office SharePoint Server 2007 Search Application Administration Web Service.
Excel Service Soap | /SSP/ExcelCalculationServer/ExcelService.asmx | Microsoft Office SharePoint Server 2007 Excel Services Application Web Service.

   

The object model automatically short-circuits the web services (that is, it invokes the underlying functionality without invoking the web service) when the target server is also the client, primarily for performance reasons. Hence, the web services are not used...

  • On a Basic deployment.
  • When the administrative action is performed on a WFE that also happens to be the indexer.

Global Web Service (SearchWebService)

Runs in the Office Server Web Services virtual server root application pool, i.e. an application pool that does not belong to any SSP. This GLOBAL application pool runs as NetworkService.

It is used to retrieve low level computer configuration settings before any SSP is created, e.g. system drive info, verify path correctness, the computer's IP Address.

It is also used to create/configure a propagation share. The web method that implements this functionality is special: It impersonates the WindowsIdentity making the request. That identity must be a local admin on the remote server (only local administrators can create/configure shares).

Allowed access: WSS_ADMIN_WPG.

 SSP (Application) Web Service (SearchApplicationWebService)

Primarily used for SSP administration of Search configuration.

A web service associated with a specific SSP on a specific server (indexer and/or query server).

Runs as the SSP web service credentials (the credentials that you enter in the SSP creation/details page).

The SSP web service account can read/write from/to the SSP database and the Search database (only the ones that belong to its SSP).

Allowed access: WSS_ADMIN_WPG and the SSP administration application pool identity.

Security

InterServer Communications

Network traffic can be secured with either SSL on port 56738, or with IPSec on either port.

IPSec is an IP level feature, which means all traffic on the configured ports is protected; whereas, SSL is an application level protection mechanism.

IPSec has the advantage of limiting which pairs of servers can communicate, by configuring the IP addresses. This feature can significantly lock down a server farm.

Service Accounts Used

Search service account

  • It is a db_owner in ALL SSP databases.
  • It is a db_owner in ALL Search databases.
  • It has READ ONLY access to all the content in ALL web applications via a policy.
  • It has read/write access to the propagation share on Query servers.
  • It has read/write access to the Search registry hive.
  • It has read/write access to the Search index location.

SSP administration site application pool identity

  • This account is determined by the web application that you select when you create the SSP.
  • It has read/write access to the SSP database and the Search database.
  • This account has full control over the Search service via its COM interfaces.
  • It has read/write access to the Search registry hive.

Global web service account

  • This is the GLOBAL application pool account of the Office Server Web Services, i.e. an application pool that does not belong to any SSP.
  • It is always set to NetworkService.

SSP (Application) web service

  • The application pool account of an SSP web service (the credentials entered in the SSP creation/details page).
  • This account has read/write access to the SSP database and to the Search database of an SSP.
  • This account has full control over the Search service via its COM interfaces.

  • It has read/write access to the Search registry hive.


Information Management Policies – Expiration


The question is, when an information management expiration policy is defined, is the expiration period applied immediately or at some future time? If the answer is, at some future time, exactly what is that time, and can you set it?

Contrived Example

To take a concrete example, create a document library.

Next, modify the default view so we can see what is happening. Go to Settings, Views, and click on All Documents. Add the columns Created, Exempt from Policy, and Expiration Date to the view. Press the OK button to save the view changes.

Now go to Settings, Document Library Settings, and click on Information management policy settings.

  • Select Define a policy and press the OK button.
  • Check Enable Expiration
  • Under retention period, select A time period based on the item's properties.
  • Select Created + 30 days.
  • Under When the item expires, select Perform this action, and Delete.
  • Press the OK button.

Return to the library, All Documents view. Upload several documents. The uploaded documents will have an Expiration Date of the current date plus 30 days as expected.

Go back to the Settings, and change the retention period to Created + 60 days. Upload some more documents. The newly uploaded documents correctly show an expiration date of the current date plus 60 days, but the previously uploaded documents still have an expiration date of the current date plus 30 days. It appears there is an inconsistency.

Information Policy Jobs

The key to this inconsistency is the Information Management Policy timer job. This job runs once daily by default. It iterates over all the web applications, site collections, sites, and lists in the farm, looking for information policy changes. When a policy change is found, the metadata of all affected items is updated; consequently, the expiration dates of the library documents in our contrived example are not updated until this job runs. When this job eventually runs, the inconsistency will be corrected.
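The behavior in the contrived example can be modeled in a few lines. The following is a conceptual sketch in Python, not SharePoint code: the Document class, the run_policy_job function, and the upload date are all made up for illustration. It assumes the Expiration Date is stamped at upload time and only re-stamped when the timer job runs, which is what the example above observes.

```python
from datetime import date, timedelta


class Document:
    def __init__(self, created, retention_days):
        self.created = created
        # Expiration Date is stamped once, using the policy in force at upload time
        self.expiration = created + timedelta(days=retention_days)


def run_policy_job(documents, retention_days):
    """Stand-in for the timer job: re-stamp every document with the current policy."""
    for doc in documents:
        doc.expiration = doc.created + timedelta(days=retention_days)


uploaded = date(2009, 4, 1)          # hypothetical upload date
old_doc = Document(uploaded, 30)     # uploaded under Created + 30 days
policy_days = 60                     # policy then changed to Created + 60 days
new_doc = Document(uploaded, policy_days)

print(old_doc.expiration, new_doc.expiration)  # 2009-05-01 2009-05-31
run_policy_job([old_doc, new_doc], policy_days)
print(old_doc.expiration, new_doc.expiration)  # 2009-05-31 2009-05-31
```

Until the job runs, two documents created on the same day can carry different expiration dates, which is exactly the inconsistency seen in the library view.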

There is an stsadm command to change the schedule of this job, SetPolicySchedule.

stsadm -o setpolicyschedule -schedule <recurrence string>

Parameter: schedule

Required? Yes

Value: A valid Windows SharePoint Services Timer service (SPTimer) schedule in the form of any one of the following schedules:

  • "Every 5 minutes between 0 and 59"
  • "Hourly between 0 and 59"
  • "Daily at 15:00:00"
  • "Weekly between Fri 22:00:00 and Sun 06:00:00"
  • "Monthly at 15 15:00:00"
  • "Yearly at Jan 1 15:00:00"

An acceptable default value is "once every 24 hours."

Description: Sets how often the policy framework processes changes to a policy. The value should be a properly formatted SPTimer argument.

Since this job could affect performance in a large farm, be careful when scheduling it. Daily is probably sufficient. It is also a good idea to have it run about an hour before the Expiration Policy job, so the Expiration Policy job will find up-to-date item metadata when it applies the expiration policy to items.

You can find current information on both the Information Management Policy and Expiration Policy jobs by going to Central Administration > Operations > Timer Job Definitions. This will show you the frequency and last run time of each job, but not the complete schedule.

There is currently no stsadm command to change the Expiration Policy job; however, Mattias Lindberg has a code sample to force the job to run using the object model.

Summary

  1. Item metadata is not updated for policy changes until the Information Management Policy job executes.
  2. The Information Management Policy job schedule can be set with stsadm -o setpolicyschedule.
  3. The Information Management Policy job should be scheduled to execute shortly before the Expiration Policy job.
  4. The retention policy is not applied to items until the Expiration Policy job executes.
  5. The Expiration Policy job currently can only be scheduled through the object model.

My Site Recommendations


The following My Site recommendations are a composite of best practices taken from experiences at Microsoft and other large customers.

Planning

My Sites (even if they are as small as possible and only really used to store a profile picture) complicate backup/recovery, and add complexity and risk to ensuring the availability of the rest of the SSP farm.  Large organizations (100K+ employees) should consider putting My Sites into a separate farm.

Very large organizations might consider multiple My Site farms, perhaps regionally located. This minimizes the number of content databases per farm and places the My Sites geographically closer to the site owners. Having fewer content databases per farm eases administrative burdens. Having My Sites hosted closer to site owners reduces the effects of network latency, thereby enhancing their usage experience. Multiple My Site farms can also provide more flexibility in managing the effort and time required to deploy updates and service packs to any given farm.

Always create a dedicated Web Application to host My Sites. This allows leveraging web application policies to define security, facilitates content database management, and enables creation of zones for external access.

Do not customize the My Site site definition. Besides being unsupported, poorly designed customizations can severely impact server performance and unnecessarily consume valuable CPU and memory resources. Any customization should be done through feature stapling, see http://blogs.msdn.com/sharepoint/archive/2007/03/22/customizing-moss-2007-my-sites-within-the-enterprise.aspx and http://blogs.msdn.com/cliffgreen/archive/2008/03/13/removing-web-parts-from-the-my-site-web-part-gallery.aspx

Indexing

Install the latest service pack (currently SP1). Be sure to get the latest post service pack hotfixes applied, in particular 21243 (Office QFE). There is an issue where incremental crawls will not pick up all changes on My Sites without the post SP1 hotfix.

Use a separate content source for People Profiles rather than allowing it to default to the Local Office SharePoint Server Sites content source. My Sites full crawls can be time consuming due to the large number of site collections. Creating a separate content source enables independent crawl configuration, such as the type of crawl and crawl frequency for My Sites.

Provisioning

Remember, a user profile page will exist for all employees following a full Active Directory profile import, even if no My Sites have been created yet. Profile pages allow basic employee information to be exposed in search results.

Allow users to create their own My Site on demand. Do not pre-provision My Sites. Generally, pre-provisioning is a time consuming process potentially taking many days or weeks. It gains little, and costs storage and administrative headaches.

Rolling-out

Make My Sites available to everyone on day 1. This allows for "viral" adoption by early adopters. This will eventually encourage others to create My Sites, thereby getting the momentum rolling.

Send out invitations to a small group of "pilot" users who would be interested in trying out My Sites, based on their role in the organization. The pilot group might contain a few hundred users. This gets a critical mass of My Sites in place quickly.

Try regional roll-outs via "soft launches", which include poster campaigns or brown bag lunch training at selected campuses and offices.

About the 3rd or 4th month, promote the My Site feature in a story on the Intranet portal home page.

Around the 6th month, incorporate the concept of "filling out your profile" into new employee orientation as a specific training exercise. Now essentially all new hires will have a My Site (because they need one to store their profile picture).

Encouraging On-Going Adoption

Encourage high-profile "executive blogging" to drive awareness and adoption of My Sites. Blogging topics might include annual business planning, corporate strategy, rumor control, etc. This can demystify blogging by encouraging many participants to make daily posts about what they are doing and what they are thinking about. Note: this implies more frequent incremental crawling to incorporate blog entries into the search index.

Also, consider setting up a "Blogs" scope on the search home page to facilitate blog discovery. This can be accomplished by setting up a scope based on the Content Type of blog posts. An example follows:

Note that this scope will pick up blog posts no matter where the blog resides, as long as they are SharePoint blogs crawled by a SharePoint content source.

Consider adding a link on the profile page that allows others to send an email to ask the person to fill out their profile ("peer nagging").

Explore holding a contest or raffle: if you fill out (or update) your profile this month, you are eligible to win a prize.

Posted by jimmiet | 1 Comments

Vanity Site Collection URLs


A customer recently asked for “vanity” URLs for each of the major departments (HR, Finance, Legal, etc.). For example, HR would be http://hr.bigcorp, and Legal would be http://legal.bigcorp. No problem, you say: just create a web application for each department, give the web application's default zone the vanity URL, create the corresponding DNS entries, and the requirement is fulfilled.

The problem is that there are over 30 departments, not to mention foreign subsidiaries, and possibly other “I want my own vanity URL” requests from other corporate groups. Since web applications are heavy resource consumers, each one requiring one IIS web site per zone, basing vanity URLs on web applications would not be feasible.

The traditional out-of-the-box site collection paths did not meet the requirements:

Wildcard inclusion: e.g., http://bigcorp/sites/finance

Explicit inclusion: e.g., http://bigcorp/finance

For both wildcard inclusion and explicit inclusion, the vanity part of the URL is at the end, which is not what was desired.

Host-Named Site Collections

Host-named site collections (what used to be called "scalable hosting mode" in WSS v2) provide exactly the needed capability. Don’t confuse host-named site collections with host headers; they are different concepts. The host-named concept applies to the internal SharePoint virtual path mapping mechanism, whereas host headers apply to IIS web sites independent of SharePoint.

Host-named site collections effectively allow use of an arbitrary URL, which is associated with an existing web application. There can be many host-named sites for a web application.  The net result is freedom to have as many vanity URLs as necessary, while limiting the number of web applications. We can have URLs like http://finance.bigcorp, http://finance, http://my.finance, etc.

Host-named sites cannot be created through the UI. You must use stsadm. This should not deter you, since the syntax is simple. The secret is to add the -hhurl parameter to the stsadm createsite command.

Host-Named Site Collections Limitations

As with all good things, there are some limitations and complications. The following blog is an excellent read: http://blogs.msdn.com/sharepoint/archive/2007/03/06/what-every-sharepoint-administrator-needs-to-know-about-alternate-access-mappings-part-1.aspx. Quoting from this blog:

Host-named site collections short circuit much of the AAM functionality, including the URL remapping functionality.  They're always considered to be in the Default zone, and their URL is always the same URL you supplied when creating the site collection.  You cannot use host-named site collections with off-box SSL termination, port translation, or host header manipulation scenarios.

This whitepaper is also highly recommended reading. It discusses how to enable SSL, and many other configuration issues:

http://office.microsoft.com/search/redir.aspx?AssetID=AM102157711033

Step-by-Step Example

Assume we want to create a site collection with the vanity URL http://finance.litwareinc.com in the http://extranet.litwareinc.com web application.

Step 1: Open a command prompt as a farm administrator, and then enter this command, being sure to include the -hhurl parameter:

>stsadm -o createsite -url http://finance.litwareinc.com -ownerlogin litwareinc\administrator -owneremail administrator@litwareinc.com -hhurl http://extranet.litwareinc.com -sitetemplate STS#1 -title "Finance Department" -quota DepartmentalSite

 

Operation completed successfully.

The new site collection appears in the site collection list of the extranet.litwareinc.com web application.
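
With 30 or more departments, the command in Step 1 is a natural candidate for scripting. The following Python sketch only builds the command lines; it reuses the parameters shown above, while the department list, the owner account reuse, and the helper name build_createsite_command are hypothetical. Review the generated commands before running any of them.

```python
# Sketch: generate one stsadm createsite command per vanity URL.
# The department list and build_createsite_command() are illustrative
# assumptions; only the stsadm parameters come from the example above.

def build_createsite_command(host, title, webapp_url, owner_login, owner_email,
                             template="STS#1"):
    """Return the stsadm command line for one host-named site collection."""
    args = [
        "stsadm", "-o", "createsite",
        "-url", f"http://{host}",
        "-ownerlogin", owner_login,
        "-owneremail", owner_email,
        "-hhurl", webapp_url,        # the web application that hosts the site
        "-sitetemplate", template,
        "-title", f'"{title}"',      # quote the title because it may contain spaces
    ]
    return " ".join(args)


# Hypothetical vanity host names mapped to site titles.
departments = {
    "finance.litwareinc.com": "Finance Department",
    "hr.litwareinc.com": "Human Resources",
    "legal.litwareinc.com": "Legal Department",
}

commands = [
    build_createsite_command(
        host, title,
        webapp_url="http://extranet.litwareinc.com",
        owner_login=r"litwareinc\administrator",
        owner_email="administrator@litwareinc.com",
    )
    for host, title in departments.items()
]

for command in commands:
    print(command)
```

Each vanity host name still needs its own DNS entry, as described in Step 2.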

 

Step 2: Create a DNS entry for this new name, pointing to the IP address of http://extranet.litwareinc.com. For local testing, add an entry to the hosts file.

You can now open a browser and navigate to the site collection.

Step 3: Create a search content source for the new site collection, or include it as a start address within an existing content source. Start a crawl to include the site contents in the search index.

To test the search, create a simple text document in a shared document library. When the next incremental crawl completes, you can then search for the document to verify the search results are using the vanity URL.

The Kerberos Factor

The stsadm site creation command will give a warning if the host web application is using Kerberos authentication (negotiate).

>stsadm -o createsite -url http://finance.litwareinc.com -ownerlogin mossfs\administrator -owneremail administrator@mossfs.com -hhurl http://dmz.litwareinc.com -sitetemplate STS#1 -title "Finance Department"

 

WARNING: SharePoint no longer customizes Integrated Authentication security settings. This Web application may be using Kerberos, which can require manual configuration. See http://support.microsoft.com/?id=832769 for more information.

 

Operation completed successfully.

It is necessary to register the vanity URL with Active Directory using setspn.

>setspn -a http/finance.litwareinc.com litwareinc\administrator
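
If many vanity URLs are created, the matching setspn commands can be derived from the site URLs. This is a minimal sketch that follows the form of the command above (service class http, followed by the host name); the helper name setspn_command is hypothetical, and the account to register depends on your environment.

```python
from urllib.parse import urlparse


def setspn_command(site_url, account):
    """Build a setspn command in the same form as the example above.

    setspn_command is a hypothetical helper; it only formats text and
    does not call setspn or touch Active Directory.
    """
    host = urlparse(site_url).hostname
    if host is None:
        raise ValueError(f"No host name found in URL: {site_url!r}")
    return f"setspn -a http/{host} {account}"


print(setspn_command("http://finance.litwareinc.com", r"litwareinc\administrator"))
```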

In my testing, I stumbled into another complication. Without realizing it at first, I created a site collection in the Intranet zone of a web application, which was configured for Kerberos.

After creating the site collection, adding the DNS (hosts file) entry, and executing setspn, I got a “This site is under construction” error page every time I tried to browse to the site. I eventually worked around this by explicitly adding a host header to the IIS web site. I consider this an unsupported solution, so I recommend extensive testing before applying it to a production scenario.

Posted by jimmiet | 1 Comments

“Hidden” SSP Timer Jobs

Not all timer jobs are visible in the Central Administration timer job definition page. There are MOSS 2007 timer jobs in the SSP application which don’t appear in the Central Administration page. It makes sense that these jobs are not visible, since there is nothing you can modify or disable. All the same, it would be nice to know what these jobs are, and what their schedules are, in case you want to schedule other potentially conflicting activities.

You can see the names of these SSP jobs by using the stsadm enumssptimerjobs command, as in the following console sample (view entire article ...)

 

Posted by jimmiet | 0 Comments

How are SSP Associations Changed?

I recently heard a question, “How do you change SSP web application associations through the object model (OM)?” Searching the OM online help didn’t return any results. And yet, theoretically, it must be possible, since you can set the web application SSP association through Central Administration.

Then the thought occurred, why not reverse engineer the Central Administration page to see how it is done? Here are the steps. This technique can be applied to any administration page if you are curious how the product team wrote the code. (view entire article...)

Posted by jimmiet | 0 Comments

Deleting User Profiles

The out-of-the-box UI provides a means to manually remove user profiles. Navigate to SSP Admin > User Profile and Properties > View User Profiles. Search for the user’s profile, and then click the “Delete” context menu item or the Delete toolbar button.

This is fine for an occasional profile deletion; but what if you need to delete thousands of profiles? We recently ran into this situation. (View entire article ...)

Posted by jimmiet | 0 Comments

DST and Timer Jobs

Now that daylight saving time has arrived in the United States, have you noticed problems with timer jobs not running when expected?

I recently encountered this trying to deploy solution packages using stsadm scripts. In the past, these scripts ran flawlessly. The old solutions were retracted and deleted, the new solution versions added and deployed across all farm servers within a few minutes, and life was good.

Last week I was working with a customer to deploy a new staging farm. We got to the point at which the solution deployment scripts were run. We waited, and waited, and waited. The script was hung at stsadm -o execadmsvcjobs, which was called after executing several stsadm -o deploysolution -name solutionpackage1.wsp -immediate -force -allowGacDeployment statements. The timer jobs that deploy the solution across the farm servers should have executed within about a minute. Instead, over an hour passed. Then suddenly, the jobs ran. What caused this odd behavior, which had never happened before with the same scripts? Was there a farm configuration problem?

Then a thought occurred. The time had changed to daylight saving time a week before, moving the clocks ahead one hour. Could there be a connection?

A little research brought the problem to light. WSS 3.0 SP1 includes fixes for timer job DST scheduling problems. See KB article 938663 (http://support.microsoft.com/kb/938663/), "One-time timer jobs in Windows SharePoint Services 3.0 are delayed by at least one hour when the jobs are scheduled to occur during daylight saving time (DST)."

A quick check in Central Administration > Operations > Servers in Farm, showed the installed version number was 12.0.0.4518. It should have been 12.0.0.6219 if SP1 was installed. We installed WSS 3.0 SP1, and the timer job scheduling problem disappeared!
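
The same check can be expressed as a simple comparison of build numbers. This is a small sketch using the two version numbers quoted above; the function names are hypothetical, and the version string is assumed to be copied by hand from the Servers in Farm page.

```python
# Build number reported when WSS 3.0 SP1 is installed (from the text above).
SP1_BUILD = (12, 0, 0, 6219)


def parse_build(version):
    """Turn a dotted version string such as '12.0.0.4518' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def has_sp1(version):
    """Return True when the reported build is at or above the SP1 build."""
    return parse_build(version) >= SP1_BUILD


for reported in ("12.0.0.4518", "12.0.0.6219"):
    status = "SP1 or later" if has_sp1(reported) else "pre-SP1, install SP1"
    print(reported, "->", status)
```

Comparing tuples of integers avoids the pitfalls of comparing version strings character by character.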

I strongly encourage installation of WSS/MOSS SP1. It fixes many issues, DST being just one example. Take time to read and follow the installation instructions precisely! Remember, you have to update all servers in the farm at once. There is no rolling update, so if you have a multiple-server farm with large content databases, plan on doing this over a weekend to avoid disrupting your user community.

Posted by jimmiet | 1 Comments

User Profile Change Logging

 

Profile Change Logging

There is a table in the SSP database named UserProfileEventLog. The table maintains a history of user profile property changes. By default, it retains 7 days of history. This table can cause problems in a couple of ways.

First issue: size. This table contains one row per change of a property in a user profile. The row contains the old and new property values, along with associated metadata such as the date and time the property was changed. The implication is that the table can grow large. Assume you just configured the profile import and are ready to start the first Active Directory import. Further, assume your organization has 100,000 user accounts, and each account is populated with 12 AD attributes. The full import will result in 100,000 x 12 = 1.2 million rows being inserted into UserProfileEventLog. To extend the example, assume you also have a BDC import connection populating another 20 properties from your company's HR system. That adds another 2 million rows. There are now 3.2 million rows in the event log table. If each row is 100 bytes (old value, new value, plus metadata), the table is now approximately 300 MB in size.
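
The arithmetic can be rerun with your own numbers. This is a rough sketch only: the function name is hypothetical, and the 100 bytes per row is the same assumption used in the example, not a measured value.

```python
def estimate_event_log(users, properties_per_source, bytes_per_row=100):
    """Estimate rows and bytes added to UserProfileEventLog by a full import.

    One row is assumed per imported property per user, as in the example;
    bytes_per_row is a rough assumption.
    """
    rows = users * sum(properties_per_source)
    return rows, rows * bytes_per_row


# 100,000 accounts, 12 AD attributes plus 20 BDC properties.
rows, size_bytes = estimate_event_log(100_000, [12, 20])
print(f"{rows:,} rows, about {size_bytes / 1_000_000:.0f} MB")
```

For the figures in the example, this gives 3.2 million rows and about 320 MB, in line with the rough estimate above.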

Second issue: deleting old entries. The change history is kept for a configurable number of days. The concern is how many rows will have to be deleted on a given day. The number of days of history is 7 by default, but can be set using stsadm.exe -o profilechangelog -title <SSP Name> -daysofhistory <number of days> -generateanniversaries (http://technet.microsoft.com/en-us/library/cc263013.aspx). The critical issue is that MOSS has to remove a full day of history every day to honor the daysofhistory setting. Using the numbers in the preceding paragraph, 7 days after the first full import, MOSS is going to delete 3.2 million rows of data all at once! What makes this an issue is that this is done with a single SQL statement, something like DELETE FROM UserProfileEventLog WHERE EventId < @MinEventTime. Think of the implications. As a single statement, this will hold locks until all 3.2 million rows are deleted. These locks might prevent other database transactions from completing. It also means the transaction log (even with simple recovery mode) cannot be truncated, and will therefore grow until at least this DELETE statement completes. I have seen this delete statement run for 4 hours, with the transaction log quadrupling in size.

There is little you can do to avoid this. Although you can adjust the number of days of history, MOSS will eventually try to delete an entire day's history. The deletion is done by an internal timer job buried within the SSP. The job is hard coded to run at 10:00 PM daily. You cannot change this scheduled time.

What are the take-aways?

  • 1. Be prepared for an occasional long running DELETE statement. Ensure you have sufficient transaction log space to accommodate the potential transaction log growth.
  • 2. Don't schedule any other timer jobs or database maintenance for 10:00 PM to minimize possible deadlocks and transaction timeouts.
  • 3. Be careful using stsadm -o profilechangelog to reduce the number of days of history, because the next time 10:00 PM comes, MOSS will try to delete multiple days of history all at once.
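
To gauge the effect of the third point before changing the setting, the size of the next purge can be estimated. This is a simplified model, not the actual timer job logic: it assumes the job deletes every row at least daysofhistory days old, and the daily row counts and function name are hypothetical.

```python
def rows_deleted_at_next_purge(rows_by_age, days_of_history):
    """Estimate the rows removed by the next 10:00 PM purge.

    rows_by_age[i] is the number of rows logged i days ago (0 = today).
    Assumption: every row aged days_of_history days or more is deleted.
    """
    return sum(rows_by_age[days_of_history:])


# Hypothetical steady state: 50,000 property changes per day, with the
# table holding today's rows plus the previous 7 days.
rows_by_age = [50_000] * 8

print(rows_deleted_at_next_purge(rows_by_age, 7))  # setting left at 7 days
print(rows_deleted_at_next_purge(rows_by_age, 3))  # setting lowered to 3 days
```

Under these assumptions, leaving the setting at 7 removes one day of rows (50,000), while lowering it to 3 removes five days of rows (250,000) in a single run.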