
office 2007 files in sharepoint v2 (the lurking danger and how to avoid it)

first off, if you're using office 2007 files inside sharepoint v2 you need to read this post. there are some very important things you need to be aware of.

as some background, office 2007 introduced a new xml based file format for office documents. the new .docx, .xlsx and .pptx files are actually all zip files containing several xml files which make up the document and its styling.

want to see for yourself?

try renaming one of those new file types with a .zip extension and you'll be able to look inside the file or extract it and examine all the xml documents. you'll see something like this:

[screenshot: the contents of an office 2007 file opened as a zip archive]
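
if you'd rather not rename anything, the same peek inside can be scripted. this is a minimal sketch using python's standard zipfile module; the function name is my own and "report.docx" in the usage note is a hypothetical file name.

```python
import zipfile


def list_package_parts(source):
    """Return the names of the files stored inside an office 2007 document.

    source can be a path or a file-like object. the document is opened
    directly as a zip archive, so no renaming is needed.
    """
    with zipfile.ZipFile(source) as package:
        return sorted(package.namelist())
```

calling list_package_parts("report.docx") on any .docx, .xlsx or .pptx should return the xml parts that make up the document.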

getting back to sharepoint, microsoft released kb article 939909 with details around the support for office 2007 documents in sharepoint v2... but essentially it says they aren't supported. if you're using them, you need to be aware of the potential issue described in this post.

sharepoint automatically stays in sync with office files and the metadata stored inside of them. when a change happens in either sharepoint or the file itself, a synchronization between the sources will occur. there are two terms used to describe this action:

  • demotion - term used to describe the act of sharepoint writing metadata into a document
  • promotion - term used to describe the act of a document writing metadata into sharepoint

you can learn a bit more about this whole promotion / demotion thing on msdn.

there is a metadata issue with this that affects both windows sharepoint services v2 and sharepoint portal server 2003. since sharepoint v2 is not aware of the office 2007 file format, it does not demote sharepoint metadata into the files, as it does with previous office formats.

so why does this matter to me?

scenario time:

let's say you have a sharepoint v2 document library with some required custom metadata fields, say  "customer" and "business unit". when a new office 2007 document is created and saved back to sharepoint, a dialog will prompt the user to enter any required fields.

when entered, metadata will be saved back to sharepoint and displayed correctly in the library but the metadata is not demoted into the document.

so what are the implications of that?

well when sharepoint and a file are out of sync it can cause unexpected results and loss of metadata. a prime scenario for this is upgrading to sharepoint v3... metadata can and will be lost in that scenario.

the dangerous part is that metadata will appear to have been upgraded successfully, and will show up in sharepoint document libraries... however as soon as a document is modified in any way (as simple as edit > ok) metadata will be reset to defaults, effectively losing all values. that's why, if you aren't aware of this issue, it can be quite confusing and potentially quite bad.
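
to make the failure mode concrete, here is a toy model of it. this is not how sharepoint is implemented; it only mirrors the behaviour described above, with plain dictionaries standing in for the list item and the document, and made up column values.

```python
def demote(list_item, document):
    """sharepoint writes its column values into the document."""
    document.update(list_item)


def promote(document, list_item, defaults):
    """the document's values overwrite the list item. anything the
    document does not carry falls back to the column default."""
    for column in list_item:
        list_item[column] = document.get(column, defaults[column])


# made up columns and values, for illustration only
defaults = {"customer": "", "business unit": ""}
list_item = {"customer": "contoso", "business unit": "sales"}
document = {}  # v2 never demoted anything into the office 2007 file

# the first edit after the upgrade syncs from the (empty) document
promote(document, list_item, defaults)
print(list_item)
```

if demote(list_item, document) had run first, the document would carry the values and the later promote would leave them alone. that missing demotion is the whole problem.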

safely upgrading from sharepoint v2 to sharepoint v3 with office 2007 files

if you have office 2007 files in sharepoint v2, you will need to upgrade from v2 > v3 in a certain order and run a tool after the upgrade to fix the metadata relationships with their files.

the upgrade process will be:

  1. run prescan on your v2 database(s) and fix any issues identified.
  2. disable any sybari / forefront scanning.
  3. attach your v2 database(s) to your v3 farm, upgrading them.
  4. after upgrading, run the metadata_refresher tool below to sync any office 2007 documents with their metadata.

    note: modifying documents and/or their metadata in any way before running the refresh tool will cause metadata loss. the metadata will be reset to default values, so ensure this tool (or something like it) is run immediately after upgrades to v3 have completed.

metadata refresher tool

the source code attached to this post contains an application which can correct the metadata sync issue. two things to note:

  1. only the most recent version of files will be corrected.
  2. any documents which are linked copies will be skipped to avoid breaking the relationship but these can be located in "scan mode".

see the readme.txt file for details on usage.

disclaimer: while this code has been tested and proven to work, anyone considering its use should test and validate in their own test environments before attempting to use on any type of production server/data. the code is provided as is, with no warranty of any kind and confers no rights.

Posted by skelley | 0 Comments

Attachment(s): metadata_refresher.zip

codeplex + copytimer = sptoolbox

it's been a while since my last post... i've been busy working on creating a sharepoint metrics repository for our hosted sharepoint customers. i'll talk about that a bit more in a future post. ;)

today i'd like to announce the release of an internal tool for measuring sharepoint performance called copytimer.

copytimer will be one of many releases tracked in a new site for sharepoint tools and utilities at:

http://www.codeplex.com/sptoolbox

this codeplex site will contain various tools for sharepoint developed by my team (msit / sharepoint online services group), the sharepoint product group, as well as sharepoint rangers and mvps.

the first item added to the toolbox is called copytimer. this is a standalone application which measures sharepoint performance by timing the downloads/uploads of files to a sharepoint site. it then records how long the operations took along with a bunch of information about the client system like latency, ethernet adapter, ip address, etc...

the best part about the tool is it can be run from any windows client which has access to the sharepoint site being tested, as it uses the webdav protocol for file transfers... nothing uses the object model. special thanks to sean livingston for creating the original version and working out many of the tricky details for uploading to sharepoint thru webdav.
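
to give a feel for the idea (this is not copytimer's code), here is a rough python sketch of timing a transfer over plain http verbs, which is what webdav uses for file reads and writes. the function names are mine, and the sketch ignores windows authentication, which a real sharepoint site would almost certainly require.

```python
import time
import urllib.request


def time_operation(operation):
    """Run operation() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = operation()
    return result, time.perf_counter() - start


def webdav_put(url, payload):
    """Upload bytes with an http PUT and return the response status."""
    request = urllib.request.Request(url, data=payload, method="PUT")
    with urllib.request.urlopen(request) as response:
        return response.status


def webdav_get(url):
    """Download a file with an http GET and return its bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()
```

a timed download would then look like time_operation(lambda: webdav_get(url)), where url points at a file in a document library you can reach.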

internally at microsoft, we've used copytimer for future planning and architecture design as well as monitoring and measuring the overall performance of our sharepoint environment.

you can download the latest release of copytimer here:

https://www.codeplex.com/release/projectreleases.aspx?projectname=sptoolbox&releaseid=8366

here are a few screen shots, because what good is an app without screenshots ;)

[screenshot: cmd]

[screenshot: results]

and here are some examples of what you can do with data from copytimer in excel:

[screenshot: results_excel]

[screenshot: results_excel2]

enjoy!

Posted by skelley | 5 Comments

sending email to sharepoint

sharepoint v3 has the new feature of sending email to sharepoint lists.

cool right?

there is even a built in sharepoint list template called "discussion boards" which is ideal for incoming emails, and acts as a threading tool for grouping messages together.

<soon i'll insert an image of a discussion board list>

this is handy for a lot of reasons but one potential usage of emailing to sharepoint could be archival storage of team email discussions sent to distribution groups / security groups (DGs / SGs).

to set up DG archiving to sharepoint, here is a quick overview of what you'll have to do:

  • install windows smtp server component in the control panel on your sharepoint web front ends (not installed by default)
  • next, enable incoming email in sharepoint central admin > operations > incoming email settings
    • configure settings as per your environment
  • create the sharepoint list for incoming email
    • when creating the list be sure to check the "allow incoming email" box
  • get the list's email address. it'll be something like listname@sharepointfarm.yourcompany.com
  • create a new AD contact object
    • set the SMTP address of the contact object to the above email address
  • add this contact object to the DG / SG as a member and all future emails will be distributed to sharepoint as well

pretty sweet right? this can all be automated as well with some not too complex code.

now before you start going email crazy there are a couple of concerns to consider first.

lists in sharepoint are supposed to remain under 2,000 items (being returned to the view). you'll need to consider that you'll likely hit this limitation quickly, as some email DGs can get more than 2,000 mail items in only a few days (or less).

ok fine.. i'll admit it...

in sharepoint technically you can have more than 2,000 items in a list... you could even have millions. 

the key is that you need to ensure that the view you are using for displaying the list, is not returning any more than 2,000 items.

as an example, set up the default view of a list to filter for the last 7 days or something which you know will always have a result set of under 2,000 items. the rest of the items can be found by searching, which is likely what people will do anyway. who's gonna browse a list of thousands of items?

so that is one consideration. ensuring there are default views which limit the items displayed for all email enabled lists.

another issue is storage. most companies have lots of DGs and if all of a sudden they're going to sharepoint you'll run out of storage the first day!

that's why you'd need to figure out a rollout plan. slowly scale up and watch how much space is needed.

of course, this should all be tested in a lab first too, to get some ballpark estimates of how much space is required for 1 email message, how many messages sent on average per day to this DG, etc...
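
the ballpark math is simple enough to script. here's a minimal sketch; every number in it is made up for illustration, so plug in whatever your lab tests actually measure.

```python
def estimate_storage_mb(avg_message_kb, messages_per_day, days):
    """Rough storage needed for one email enabled list, in megabytes."""
    return avg_message_kb * messages_per_day * days / 1024


# all three inputs are invented for illustration, not measured values
estimate = estimate_storage_mb(avg_message_kb=75, messages_per_day=400, days=365)
print(f"{estimate:.0f} mb per year for this dg")
```

multiply that by the number of DGs you plan to archive and you have a first guess at what the rollout will cost in storage.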

hey sean, what about legal ramifications?

many companies are starting to adopt (or have already) some type of retention period setting for old emails. lawyers don't want extra documents that could be used against the company kept around any longer than required. i think the minimum retention period for email is 1 year (unless it's business critical) but i'm not a lawyer, and that depends on your organization.

the bottom line is, you're gonna want to consult your company's lawyers and find out what their policy is. sharepoint allows you to set policies on sites to expire content after a certain period of time so that is pretty much exactly what you'll need.

now you could do this for every site collection programmatically, but what if admins of the site just go in there and turn off the policy?

so are you going to run this policy setting application nightly on a timer to always ensure all sites have the correct settings?

another whole issue to deal with.

there are plenty of things to think about, but sending email to sharepoint is a really cool feature and i imagine this will only gain popularity in the future.

in a future post (and my team's upcoming book MOSS for Architects and Engineers) i'll cover some more in depth / best practice configuration settings for exchange and sharepoint to get incoming email to sharepoint working.

-sean

Posted by skelley | 2 Comments

stsadm import / export missing content

in sharepoint v3, the tool stsadm has an operation for importing and exporting sites and/or webs. it is the replacement for smigrate from v2.

so say you want to move a web to a different site collection, you would need to use this tool.

unfortunately there seems to be a bug in the import / export operations.

it's a rather nasty bug too that will move most of your content but not everything, making it extremely difficult to detect.

for example you might have a document library with 20 files but only 5 of them get moved, all without any obvious notices in the log files.

the only way you would even know that it was not a complete import is if you knew the contents you were expecting to show up.


when exporting content you should use the following switches:

stsadm -o export -url <url> -filename <filename> -includeusersecurity -versions 4

when importing that content you should try using:

stsadm -o import -url <url> -filename <filename> -includeusersecurity

 

this has worked for me and my export / imports have been much more successful than without those switches, but you should still try to verify this for yourself as well.

if you're simply moving a site collection to a different location you can use stsadm backup / restore which seems to be a little bit more reliable at this point.

a bug has been filed for this problem and the wheels are in motion to help fix this thing.

UPDATE: apparently this is not a bug and is by design. documents which do not have a published major version (draft copies, for example) will not be exported unless you use the -versions 4 switch. it's likely that in almost all cases you will want to use this switch to ensure all content is migrated.
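
if you run these moves often, it's easy to forget a switch. here is a small python sketch that builds the two command lines with exactly the switches shown above. the function names are mine, and it assumes stsadm is on the path of the server you run it on.

```python
import subprocess


def export_command(url, filename):
    """Build the stsadm export call, including -versions 4."""
    return ["stsadm", "-o", "export", "-url", url, "-filename", filename,
            "-includeusersecurity", "-versions", "4"]


def import_command(url, filename):
    """Build the matching stsadm import call."""
    return ["stsadm", "-o", "import", "-url", url, "-filename", filename,
            "-includeusersecurity"]


def run(command):
    """Run stsadm and raise CalledProcessError if it exits non-zero."""
    subprocess.run(command, check=True)
```

a clean exit code still doesn't prove everything arrived, so compare item counts on both sides afterwards.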

Posted by skelley | 3 Comments

enabling external rss feeds

sharepoint v3 has the nice ability to display rss feeds via web parts... but there is a potential problem.

most rss feeds people care about are external, meaning outside the company's intranet.

in order to allow your sharepoint server external access you may have to add a proxy for iis so it can reach the outside internet.

to do this you must edit the web.config file for the appropriate virtual server.

warning: make a backup of your web.config before modifying

add this block toward the bottom of your web.config, but before the closing </configuration>. also you should probably do a CTRL+F and look for the word proxy in your web.config... you want to make sure if you already have settings in there that they don't conflict.

   <system.net>
      <defaultProxy>
         <proxy usesystemdefault="false" proxyaddress="http://yourproxyhere" bypassonlocal="true" />
      </defaultProxy>
   </system.net>

this should not require a restart of iis.... iis will pick up on the change automatically. try that apache. ;)
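
the "look for existing proxy settings first" check can also be scripted instead of eyeballed. a minimal sketch with python's standard xml parser follows; the function name is mine, and it assumes your web.config has no xml namespace declared on its elements.

```python
import xml.etree.ElementTree as ET


def find_proxy_settings(web_config_xml):
    """Return the attributes of every <proxy> element found under
    system.net / defaultProxy in the given web.config text."""
    root = ET.fromstring(web_config_xml)
    found = []
    for system_net in root.iter("system.net"):
        for default_proxy in system_net.iter("defaultProxy"):
            for proxy in default_proxy.iter("proxy"):
                found.append(dict(proxy.attrib))
    return found
```

an empty list means there is nothing to conflict with. anything else is worth reading before you paste in a second block.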

 

Posted by skelley | 0 Comments

sharepoint terminology defined

in sharepoint there are a lot of terms that may be confusing to new sharepoint admins, or even to existing ones.

since a lot of the words have other meanings than what sharepoint uses them for, it can be a bit tricky. i know when i was starting out with sharepoint i was very confused for a while because of the ambiguity of the words.

so let's take it from the top:

wss / windows sharepoint services -  this is a free product from microsoft (comes with windows server). it gives you all the basic functionality of sharepoint including document libraries, lists, even things like email integration.

moss / microsoft office sharepoint server - this is an add-on to wss which gives some very cool additional features. some examples are excel services, infopath forms server, ecm (enterprise content management) and moss enterprise search. this is not free.

also you can't just install moss... moss is built on top of wss, so wss is a prerequisite.

web application / web app - also known as a virtual server (in sharepoint v2) and a web site / application pool (in iis), web apps allow for logical separation of sharepoint content. each web app runs under a different process on the iis web server.

two examples of different web apps are the central administration site which runs on an arbitrary port number and then standard sharepoint sites which typically run on port 80. they both run under separate processes in iis.

site collection / top level site / parent site / spsite - a site collection is a web site that can contain sub-sites (aka webs), which all share the same owner and administrators of the top level site collection.

a site collection controls global settings that sub-sites underneath it inherit. settings can include permissions, storage quotas, and themes, etc...

webs / sub-sites / spweb - these are web sites that live underneath a site collection. these are almost the same thing as a site collection but the difference is the global settings that are applied from the site collection.

webs can have their own settings independent of the parent site collection, but inheriting them makes the site a bit easier for its owner to manage. that's the idea here.

lists - a list is a generic term used to define the different places to store content in a sharepoint site. some built in lists are document libraries (upload and share word docs), picture libraries (upload pics), and custom lists where you define what you want to store.

scenario time!

ok let's discuss one potential way to set up a sharepoint site.

imagine an organization of 60 people, and then 3 sub-teams of 20 people each.

a potential structure of the team's sharepoint site is a site collection that contains information for all 60 people (ie vacation calendars for the whole team).

then each of the 3 teams could have their own webs, for the smaller teams, which contain all the documents relevant to each team.

the url structure could look like:

in this example orgsite is the site collection and team1site, team2site, team3site are the webs.
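
the relationship can also be sketched as a tiny data model. the class below is just an illustration, and "http://sharepoint/sites" is a hypothetical host and path i picked for the example, not a url from a real farm.

```python
from dataclasses import dataclass, field


@dataclass
class SiteCollection:
    """A top level site plus the webs (sub-sites) that live underneath it."""
    url: str
    webs: list = field(default_factory=list)

    def add_web(self, name):
        """Register a web and return its url, which hangs off the parent url."""
        web_url = f"{self.url}/{name}"
        self.webs.append(web_url)
        return web_url


# hypothetical host and path, for illustration only
org = SiteCollection("http://sharepoint/sites/orgsite")
for team in ("team1site", "team2site", "team3site"):
    print(org.add_web(team))
```

each web's url sits underneath the site collection's url, which is the same parent / child relationship the settings follow.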

a picture is worth a thousand words so in my next post i will include some diagrams to better show the relationships.

Posted by skelley | 1 Comments

sharepoint alerts: how to repair them after a web app move

this past weekend we had a sharepoint farm where we needed to detach the databases and reattach them to a different web application.

this can break a lot of things. dr watson has a good list of some of them.

i'm going to focus on alerts and how to fix them (yep they break).

first of all there is a nice chunk of sample code to help fix this but it's incomplete:

http://support.microsoft.com/default.aspx?scid=kb;en-us;936759

here's the piece of code we care about:

try
{
    alert.Update();
    //Change the Alert frequency back to the initial state.
    alert.AlertFrequency = afPrevious;
    alert.Update();
}
catch (Exception ex)
{
    Console.WriteLine(" -> Error changing Alert. {0}", ex.Message);
} // inner try

basically all they're doing is changing a value in the alert, updating it, and then putting the value back to what it was.

what this essentially does is update a hard coded value called siteurl in the alerts table.

this works, kinda

the alerts will be sent out but the contents of the alerts will be referencing old links.

that's bad news.

so how do we fix it?

well let's start with the content database tables that deal with alerts:

immedsubscriptions contains immediate alerts
schedsubscriptions contains all other alerts (daily, weekly, etc)

ok so let's have a look there

select top 20 * from immedsubscriptions
select top 20 * from schedsubscriptions

hmmm the column called properties seems to have some hard coded urls...

[screenshot: query results showing the properties column]

pretty strange since there is another field in that table called siteurl

(fyi updating the siteurl in these tables is what the microsoft supplied sample code fixes)...

so why not just use that siteurl as the url in properties instead of hard coding it? 

no clue but they probably had a good reason.

anyways now we have to see what's in that bag... the property bag.

// assumes alert is the SPAlert being processed, and that countsiteurlprop
// and countmobileurlprop are int counters declared outside this snippet
string oldurl = "http://myoldwebapp";
string newurl = "http://mynewwebapp";
SPPropertyBag spprop = alert.Properties;

// fix hard coded properties
if (spprop["siteurl"] != null)
{
    //Console.WriteLine("pre alert site url: " + spprop["siteurl"].ToString());
    if (spprop["siteurl"].Contains(oldurl))
    {
        // found old link reference
        spprop["siteurl"] = spprop["siteurl"].Replace(oldurl, newurl);
        //Console.WriteLine("post alert site url: " + spprop["siteurl"].ToString());
        Console.WriteLine("Updated siteurl in properties");
        countsiteurlprop++;
        spprop.Update();
    }
}

if (spprop["mobileurl"] != null)
{
    //Console.WriteLine("alert mobile url: " + spprop["mobileurl"].ToString());
    if (spprop["mobileurl"].Contains(oldurl))
    {
        // found old link reference
        spprop["mobileurl"] = spprop["mobileurl"].Replace(oldurl, newurl);
        Console.WriteLine("Updated mobileurl in properties");
        countmobileurlprop++;
        spprop.Update();
    }
}

how did i know to look there? (i'm sleeping with the property bag lady)

just kidding, it took a bit of tinkering considering that the properties field is not directly writable from an alerts object:

SPAlert Properties

Properties - Gets the properties of the alert.

http://msdn2.microsoft.com/en-us/library/microsoft.sharepoint.spalert_properties.aspx

if you want to change those values, you must go through the property bag.

this concludes how to fix alerts after a webapp move. i hope you like.

Posted by skelley | 4 Comments

welcome

the reason why i started this blog is because i found myself following this trend:

  1. start with a question about sharepoint
  2. search internet / blogs / etc... (you know the drill)
  3. don't find anything helpful, so conduct research
  4. figure it out but then the information is lost if it's not written down and shared

the point is people usually follow this same order of finding answers, so why not add information to step 2?

steps 3-4 take quite a bit of time so it's great when you can skip those.

plus it's great when you find a really helpful blog somewhere.

btw a couple of things:

  • i'm not a big fan of capital letters so don't expect to see a lot of those
  • i am a big fan of bulleted lists
  • blogs are so serious. i'm not planning to follow suit.

ok that's it for the welcome post. enjoy.

Posted by skelley | 1 Comments
 