Determining the Global Deployment Approach
I was on a call the other day with a customer who was looking for prescriptive guidance for their global company. What is the right approach? What's the formula? How can I determine it? Since the Geo Whitepaper hasn't been rev'ed yet, I figured I'd provide my thoughts on the topic in the meantime. The companies below are fictitious, but the scenarios are very real. I hope this helps you understand how to plan a global deployment, what matters, and how the different approaches break down. There are obviously many variations, but these are the most common deployment types that I've come across in global, multinational Intranet deployments.
Centralized
In my previous post, I talked about how the centralized deployment is the holy grail for the server/farm administrator. It is obviously the easiest to maintain, given a good recovery strategy: SQL log shipping, SQL 2005 database mirroring (planned to be supported once the support team has the documentation; give it 3 months), clustering and remote snapshots, or take your pick! Having a single central admin and a single place to administer enterprise search keeps things simple. The downside, which should be obvious, is client perf. To support the multiple languages in the example below, the WSS and MOSS language packs for the top 6 languages (based on internal population) have been deployed. The central portal is in English, and variations are being considered for the next rev.
Contoso Pharma has 7,000 users in its global deployment, with 50 hospitals spread across the globe. Headquarters is in New York, which holds 35% of the company, and there is a secondary data center in New Jersey. Most users are in the US, but there are very large offices in Tokyo and Singapore, plus a number of medium-sized offices in each of the European countries. Latin America has the plants where their pills are manufactured. The plant workers will not be authoring any content, only viewing the records where the ingredients are posted.
They determined that in offices with 50 or more real "information workers", users need to be able to download 80% of files in 30 seconds or less on average. They called less than 30 seconds green, 30-60 seconds yellow, and over 60 seconds red, or unacceptable. In their performance testing they analyzed their file sizes and found that all file types averaged out to 200K, with Word files averaging around 100K and PPT files around 2MB. They decided to use the 2MB PPT files to determine their "least acceptable level of performance."
In their analysis they found that the plants in Latin America were getting unacceptable levels of performance with the 2MB download. When they tested page load times, however, the plants came in under 10 seconds, so they decided to set a different SLA for the plants and factories based on average page download time. In Europe the results were mixed: very close to the target in London and Paris, but over it in Prague and Barcelona. In Asia they found the Hong Kong offices OK. When comparing results, they found that utilization played a factor where bandwidth was low, such as on the 512K link to the Johannesburg office.
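To make the color bands concrete, here is a minimal Python sketch that estimates how long the 2MB test file would take on a given link and maps the result to green, yellow, or red. The transfer model is my own simplification for illustration: it treats "512K" as 512 kilobits per second, divides the file size by whatever bandwidth is left after existing utilization, and ignores latency and protocol overhead, so measured numbers will be worse. The function names are hypothetical.

```python
def estimate_download_seconds(file_bytes: int, link_kbps: float, utilization: float = 0.0) -> float:
    """Idealized transfer time in seconds (assumption: no latency or protocol overhead)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    # Bandwidth left over for this download after existing traffic on the link.
    available_bps = link_kbps * 1000 * (1.0 - utilization)
    return file_bytes * 8 / available_bps


def classify_download(seconds: float) -> str:
    """Map a download time to the color bands: under 30s green, 30-60s yellow, over 60s red."""
    if seconds < 30:
        return "green"
    if seconds <= 60:
        return "yellow"
    return "red"


PPT_BYTES = 2 * 1024 * 1024  # the 2MB PowerPoint used as the worst-case file

# Hypothetical link speeds and utilization levels, chosen only to exercise the bands.
for kbps, util in [(512, 0.0), (512, 0.5), (1000, 0.25)]:
    secs = estimate_download_seconds(PPT_BYTES, kbps, util)
    print(f"{kbps} kbps at {util:.0%} utilization: {secs:.1f}s -> {classify_download(secs)}")
```

Even in this best-case model, an idle 512K link lands in yellow for the 2MB file and a half-utilized one lands in red, which lines up with utilization mattering most on the low-bandwidth links.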
As a result, they decided to do the following: create internet-connected offices where it was expensive to keep an office on the corporate network, with VPN tunnels back to the corporate network, and keep 1MB as the minimum bandwidth for offices with more than 10 information workers (IWs). For offices with 50 or more IWs, they decided to purchase WAN acceleration devices and worked with the network team to establish a support agreement. The WAN acceleration team supplied testing and worked directly with the network team to configure and set up the devices after testing them in the corporate labs. A key factor in the support agreement was that all offices with 100 or more IWs were to have redundant devices for failover.
North American offices were all at acceptable levels, but they established an SLA on utilization and looked into QoS (Quality of Service) to keep the lines at less than 25% utilization, upgrading a few links such as Phoenix where the link was saturated. Latency was added as an SLA to keep all US offices at 50 ms or less.
South America changed some offices over to internet connected, with some satellite links and some DSL links to the plants. Overall they were able to save money and increase speed by switching over. Minimum bandwidth was established at 512K for every 25 plant workers. Some HR forms scenarios were reviewed with HR and determined to fit the plants' needs within their minimum performance levels.
Asia put WAN accelerators in offices with 25 or more IWs. Offices with fewer IWs were kept the same, but a maximum latency of 300 ms was established across all links, ensuring they were in yellow or better.
Europe and Africa - In Africa all links were brought up to at least 512K with latency of 300 ms or less, and up to 1MB for offices with more than 25 IWs. In Europe, almost all offices got WAN accelerators and failover support since most had 100 IWs or more. The links were upgraded to 2MB as a minimum, with most at 4MB or more and average latency around 150-200 ms.
Conclusion.
SLAs: Network SLAs stated that 512K was the minimum bandwidth level for Contoso worldwide. Maximum latency in offices with 25 or more IWs was 250 ms. Many plants were converted over to internet connected, meaning they use local ISPs and the network team establishes VPN tunnels over those links, which saved a bunch. Some new satellite connections came up in remote offices; the latency SLA did not apply to those links, but where there were more than 25 IWs the bandwidth was 1MB. WAN accelerator devices were not used in remote offices in Africa or South America, with the exception of the large office in Johannesburg; most of these were plants anyway, mostly navigating pages with low levels of usage. Having the network team support the WAN accelerators did influence the purchasing decision, but it was a great partnership, and the two teams are now working very closely together to monitor the performance of the network.
In the end, all offices of 50 or more IWs are in the green (under 30 seconds to download a 2MB PPT) more than 90% of the time, offices with 25 or more IWs are at least in the yellow, and plants have a mostly yellow experience, with a separate 10 second SLA for the plants on pages on the HR site. MOM web sites and services monitoring was deployed to the existing MOM deployments in Europe (London) and Asia (Hong Kong) to ping specific pages and report on SLAs, which are shared and reviewed with the network team on a monthly basis. The network team brings utilization reports and network maps to the monthly meeting.
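A target like being in the green 90% of the time can be checked against monitoring data with a few lines of Python. This is a sketch only: the sample timings are made up, the function name is hypothetical, and it says nothing about how MOM stores or exports its measurements.

```python
def band_attainment(download_seconds: list[float], limit_seconds: float) -> float:
    """Fraction of samples that finished in under the limit."""
    if not download_seconds:
        raise ValueError("need at least one sample")
    within = sum(1 for s in download_seconds if s < limit_seconds)
    return within / len(download_seconds)


# Hypothetical timings (seconds) for the 2MB test download from one office.
samples = [12.0, 18.5, 22.1, 27.9, 16.4, 45.0, 19.3, 21.7, 14.8, 25.2]

green_share = band_attainment(samples, 30)   # share of samples in the green band
usable_share = band_attainment(samples, 60)  # share of samples in green or yellow
print(f"green {green_share:.0%}, green or yellow {usable_share:.0%}")
print("90% green target met:", green_share >= 0.9)
```

Reviewing a number like this alongside the utilization reports gives the monthly meeting a shared, simple measure to argue about.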
Hardware & Ops team. With the consolidation, the total hardware cost was 35K for the SharePoint farm, supporting 1TB of databases on 2 WFE/Query servers, 1 index server, and a 2-node SQL cluster, with log shipping to a single-box farm with six 300GB disks in New Jersey that will be used as read only. The ops team is 1 guy with SQL administration experience (failover clustering, backups, SQL perf tuning), web administration experience (IIS and others), and 2 years of experience with CMS, SPS or WSS (some ASP.NET experience is helpful). A Project Manager spends half his time on this Intranet deployment and the other half on other projects. He is now scoping out an extranet deployment and eyeing the Internet site, which is currently very flat, difficult to author, and hosted by an ISP. The two people who currently support the extranet apps, and occasionally change out pages on the Internet site for HR, are looking at a future reorg that would bring all these people into the same team and move all these platforms to MOSS. HR, Marketing, and Finance can't wait for this to happen. The PM is looking at all this and seeing a future GPM role for himself. The 1 year road map brings these teams together. They will be looking at workflow and forms on the Internet site and involving the business in scoping the extranet deployment, which appears to have millions in potential cost savings, since much is done offline today with paper sent back and forth taking days, and which will introduce new scenarios for each of these departments.
Regional
Adventureworks is a large multinational company with 100,000 employees. They have 500 field offices, but their New York, London, and Tokyo offices hold 75% of the company, with splinter offices throughout the world. The network team has 3 regional data centers that line up with these 3 hubs. "Hub" actually describes what the network maps look like. As a result of data center consolidation driven by past Exchange/mail server consolidations, they have moved all their services into these data centers, with the exception of some NAS file server storage devices used for product distribution and some used for collaboration in the larger field offices. Working with the network team, they have already agreed that they should provide a good experience to these key offices. The network team is concerned about performance in the larger offices in the Midwest and L.A. In their evaluations they established that their most common file types are Office and PDF, and the average file size is 200K. They want a 5 second page load as a minimum SLA in offices with 50 IWs or more.
The network team and SharePoint team took this to the lab with a WAN simulator and determined that, to keep their SLA, they needed at least 1MB of bandwidth and 200 ms or less of latency to the SharePoint servers. Utilization did impact the slowest links, and if utilization was over 50% the SLA would not be met. Looking at their maps, the following is how it broke down...
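The lab thresholds translate into a simple per-link check. Here is a rough Python sketch, assuming "1MB" means 1 megabit per second; the `Link` type, the function name, and the two sample offices are hypothetical and the numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Link:
    office: str
    bandwidth_mbps: float
    latency_ms: float
    utilization: float  # fraction of the link in use, 0.0 to 1.0


def sla_problems(link: Link,
                 min_mbps: float = 1.0,
                 max_latency_ms: float = 200.0,
                 max_utilization: float = 0.5) -> list[str]:
    """Return the reasons a link would miss the SLA; an empty list means it passes."""
    problems = []
    if link.bandwidth_mbps < min_mbps:
        problems.append(f"bandwidth {link.bandwidth_mbps} Mbps is below {min_mbps} Mbps")
    if link.latency_ms > max_latency_ms:
        problems.append(f"latency {link.latency_ms} ms is above {max_latency_ms} ms")
    if link.utilization > max_utilization:
        problems.append(f"utilization {link.utilization:.0%} is above {max_utilization:.0%}")
    return problems


# Invented example links, not taken from any real network map.
links = [
    Link("Office A", bandwidth_mbps=2.0, latency_ms=80.0, utilization=0.30),
    Link("Office B", bandwidth_mbps=0.5, latency_ms=350.0, utilization=0.70),
]

for link in links:
    issues = sla_problems(link)
    print(link.office, "OK" if not issues else "; ".join(issues))
```

Returning the list of reasons rather than a plain pass/fail makes it easier to decide whether a link needs more bandwidth, a routing fix, or just a look at what is eating the utilization.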
Compliance will need to be enabled worldwide to ensure local compliance concerns are addressed in the regional farms. The plan to meet both corporate requirements and regional needs will be audited and reviewed with a virtual team twice a year.
North America
With New York as the data center and L.A. as the second largest office in North America, the team was concerned about performance. After testing, they found that they were able to meet the SLA most of the day, but during peak usage the SLA was pushed. The network team was pushing really hard for a replica in L.A. After negotiation and investigating third-party solutions, they decided to do one-way publishing with the out-of-the-box publishing features for the HR handbook and a few select sites, with a small deployment in L.A. as a leg. This publishing-only, single-box deployment was not backed up, but an SSP was created for local search with content management features enabled. The SharePoint team created a special Web Application with these select sites, plus mapped paths and regularly recurring deployment jobs that published nightly.
With QoS in place they could keep their SLA, but the network team wasn't ready for this. They looked at WAN accelerators and decided that when the current network investments in that office expired, they would look at replacing them with WAN accelerators, or at least with devices capable of acceleration. Budget was a concern since the network team didn't have budget for a device. They'll revisit in 2 years. Overall, with the newly established minimum levels of performance, only 1 field office connection needed to be upgraded, as a result of an office expansion in New Mexico.
It was determined that collaborative file shares across the company would be consolidated onto the company's SharePoint platform. Product distribution would become an established service run by one team, the SMS team, and the runaway services of the past would be locked down. All file services would be locked down, and requests would go through the SMS/Product distribution/Desktop team.
As far as the SharePoint hardware goes, they deployed 2 WFE/Query servers, 1 Index server, and 2 SQL 2005 nodes with remote snapshots.
South America
After analyzing the different offices, they found that performance was horrible. The links were badly maintained, with frequent outages and super high latency. The high latency turned out to be satellite links. The largest office was in Brazil. With over 100 IWs there, they decided they needed to do something. After talking and establishing a v-team, they found that most of the people there were collaborating with people in the US anyway, and that they considered the performance they were seeing normal. They were extremely polite and tolerant. After this great team building, they decided to upgrade the link and investigate cheaper ways to get better bandwidth with the network team. They asked for bids, and as a result were able to cut the price in half while doubling the bandwidth. This wasn't always the case. In other offices they decided to figure out how to keep at least 512K lines and hold the ISPs responsible for outages.
The network team encouraged choosing a specific company for purchasing bandwidth and lines across South America. As a result, a single provider was chosen (the one that gave the huge discount), many links were upgraded, and an SLA was established with consequences. Utilization on these small links was identified as the biggest factor in performance, so utilization reports would be provided by the vendor and investigation would be performed by the network team. After a month, the investigation found that desktop updates and reinstalls/package distribution were taking up most of the bandwidth (causing the utilization issues). NAS devices were put in place, and schedules were established to keep these distributions from happening during the day.
Europe/Africa
The different offices were analyzed. Some utilization concerns were looked into, and a strange app was found that was downloading large videos from the internet. After it was shut off, performance testing and analysis resumed. The French and German subsidiaries did some performance testing against the London office during a pilot. They found it OK, but wanted to see better performance. As a result, the SharePoint team optimized the home page of the Intranet portal and they retested. They were all pleased with the improvements. After the strange app came up, the SharePoint team talked to the network team about setting up a recurring meeting to evaluate utilization and latency. After the first 2 meetings they came out with 200 ms of latency as the bar to meet and found they could meet it in all offices. Keeping an eye on utilization was the biggest concern.
African offices were analyzed, and none were large enough to qualify for the required minimum performance levels. They did figure out that the 256K links were fine for navigating the pages of the web site. File downloads were long running, but with progress bars the small offices would "deal" with the changes. Desktop storage would be used for product distribution and patching after hours; new OS or Office updates and packages over 200MB would be shipped instead.
Asia
What a mixed bag. The latency was as high as 350 ms in Cambodia and Laos, and 250 ms from Shanghai. They found they could increase the bandwidth to meet the SLA. By keeping 2MB as the minimum bandwidth, all offices were able to reach the minimum performance level to Tokyo. Some network links to Tokyo were optimized, as some were being routed improperly back through New York before reaching Tokyo. After the optimization and upgrades, the region was somewhat happy.
Language was now a problem. In establishing the collaboration sites, they found that some of the teams wanted Chinese and some wanted Japanese. Using the out-of-the-box features they could support both, but during the pilot they found that some countries wouldn't collaborate if the UI of the site was not in their native language, or so it seemed. When they used English for the collaboration sites, both the Chinese and the Japanese teams were happier (not fully happy, but they found it more acceptable). For the home page of the intranet, the Chinese and Japanese teams agreed to establish authors to translate the content, and variations were created. This became an option for the subsidiaries: if they would supply authors, the SharePoint team would create variations. The marketing team, which was managing the content, created a virtual team and created workflows with scheduled processes for updating the Intranet news content and announcements.
Services
A single SSP was created in each of the 3 main deployments, with 1 special SSP in London for a custom BDC application for World Wide Sales, a web app that is used globally (mostly pages and customer information with caching on static web parts and layouts).
Portal Navigation - Global navigation across the 3 deployments will include links to London for enterprise search. Top sites in each location will be included in the site directory of each portal with local highlights and special taxonomy included in the Site directory. A business taxonomy team will work with the SharePoint team to maintain the site directory and work with them on keeping relevancy on the search environments.
Enterprise Search was the biggest concern. Since the largest office was in London, they decided that London would be where the main indexing would come from. Schedules would be established and the number of threads would be reduced to 4. SLAs around indexing of the regional deployments would be set to weekly. (The weekly SLA means all content needs to show up within a week at the latest, but in practice it should be daily, given the schedule of daily incremental updates with an 8 hour window for indexing during off hours.) Due to the large amounts of data in each of the 3 farms, an Index/WFE (index target and indexing function shared) would be established in each location. The indexing would be balanced between local indexing, which would be basically continuous, and off-hours indexing from London. The network team was informed about this plan and will keep an eye on the specific servers. To make the plan work, they added entries to the hosts file on the London Index server mapping all the web apps to the specific IPs of the Index/WFE servers.
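The hosts file trick is easy to get wrong by hand when there are many web apps, so here is a small Python sketch that generates the lines. The host names and IP addresses are placeholders (example.com names and documentation-range addresses), not values from any real deployment, and the function name is hypothetical. The output uses the usual hosts file layout of an IP address followed by whitespace and a host name.

```python
def hosts_entries(index_wfe_targets: dict[str, list[str]]) -> list[str]:
    """Build hosts file lines that point each web app host name at its regional Index/WFE IP."""
    lines = []
    for ip, hostnames in index_wfe_targets.items():
        for host in hostnames:
            lines.append(f"{ip}\t{host}")
    return lines


# Placeholder mapping: one dedicated Index/WFE address per remote farm.
targets = {
    "192.0.2.10": ["portal-ny.example.com", "teams-ny.example.com"],
    "192.0.2.20": ["portal-tokyo.example.com"],
}

for line in hosts_entries(targets):
    print(line)
```

With entries like these on the indexer only, crawl traffic from London lands on the dedicated Index/WFE in each region while users keep resolving the same names through DNS to the regular front ends.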
Profiles & People. Profiles were indexed locally to provide people search everywhere. The company directory with pictures (coming from a custom source) was implemented in each of the 3 deployments.
My Sites - My Sites were established within each region. A special web app was created to host My Sites in each region. A special page leveraging web services was created to list all created My Sites, with an alphabetical directory and a search results page (hitting the profile info).
Excel Services - Excel Services was enabled on the Finance web application, and only by special request on the collaboration sites. Two servers were added to the London farm to offload Finance's heavy calculation apps. They are using the services to analyze and provide charts and trending on specific industry approaches, something that used to be a number of linked Excel files run on file servers. After some analysis, the app itself is worth hundreds of thousands, and every hour they carve off saves the company thousands. Compute clusters are being investigated. Sales does use Excel Services to display sales results, top customers, and top sales reps. Analysis Services and Reporting Services implementations and integration are being investigated. A separate deployment in London, integrated with and consuming from the SSP, is the likely result. Note: Consuming from an SSP does not require you to use their Excel servers!
Conclusion.
Although there were many doubts in the beginning, very little needed to be changed to support the regional deployment. A multi-master setup between New York and L.A. for many sites is highly desired and will be investigated in the future; at budget reviews, third-party products may be considered. For now, one-way publishing meets most needs. Variations and language considerations, which previously were not as high a priority, have been moved front and center with some unexpected results. Additional coordination was needed to figure out workflows and deployment considerations, with cultural issues being taken much more seriously.
Distributed
Fabricam is a nonprofit organization with offices in every country, staffed by volunteers, with a part-time and volunteer network support team. Working with Microsoft, they have determined they would like to deploy WSS 3.0 to leverage the collaboration functionality. Today a mishmash of documentation and content around processes and information is spread across file servers, Linux Apache PHP apps, and FrontPage 98 IIS web sites. To say the least, anything is very difficult to find with hundreds of web servers across a huge variety of applications. They would like to replace all these custom deployments of HTML, PHP, and ASP/custom apps with an "out of the box" solution that is easy to use and easy to deploy. The network is nonexistent, although the common element is the internet, which is the network in many cases.
Within each country where 50 or fewer people would be using it, they would have a single WSS server using the Windows Internal Database engine. Backup would be local to disk using a scheduled stsadm job, with daily diffs and weekly fulls; the fulls would then be copied to a regional server once a week. Not all offices are connected to the other offices except over the internet. These servers are connected to the internet with 512K lines and administered remotely where possible, and volunteers verify the backups. A single top-level WSS site has a description of the service, a discussion list for support, a list for FAQs, and a contact list with numbers for emergencies. This site is fairly locked down.
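The backup rotation is simple enough to sketch. The Python below picks full or differential for a given day and builds the matching stsadm argument list without running it. Treat the details as assumptions: Sunday as the full-backup day and the `D:\Backups\WSS` folder are made up, the helper names are hypothetical, and the `-directory` and `-backupmethod` switches are the WSS 3.0 backup syntax as I remember it, so check them against `stsadm -help backup` on your own server.

```python
from datetime import date

FULL_BACKUP_WEEKDAY = 6  # Sunday (Monday is 0); the choice of day is an assumption


def backup_method(day: date) -> str:
    """Weekly full on the chosen weekday, differential on every other day."""
    return "full" if day.weekday() == FULL_BACKUP_WEEKDAY else "differential"


def stsadm_backup_args(day: date, backup_dir: str) -> list[str]:
    """Build the command line for the scheduled job; nothing is executed here."""
    return [
        "stsadm", "-o", "backup",
        "-directory", backup_dir,
        "-backupmethod", backup_method(day),
    ]


# Hypothetical local backup folder on the single WSS server.
print(" ".join(stsadm_backup_args(date.today(), r"D:\Backups\WSS")))
```

A scheduled task could call a script like this once a day, with a separate weekly step copying the latest full to the regional server.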
In countries with more than 5000 people using the servers, or when a disaster happens, a 2-node deployment is brought up with SQL Server on one node and indexing on the other, using NLB for load balancing.
In the US all of the offices are connected, and a deployment on the west coast and a deployment on the east coast are connected with a 2MB pipe. Bandwidth is basically 256K where there are lines, and many offices have dial-up to the internet.
A single MOSS 2007 deployment, on a donated 4-proc server with 4GB of RAM, is scheduled to be deployed at headquarters. One web app is internet facing, with forms auth and anonymous access, where forms are deployed to collect donations and to sign up and schedule volunteers. Internally, enterprise search is provided and a directory of sites is maintained for the different cross-country collaboration sites to coordinate the work. Obviously not all sites can be indexed daily, but an effort is made to keep search up to date on a weekly basis. Volunteers watch the gatherer logs and adjust the schedules.
The nonprofit is extremely happy with the out-of-the-box functionality and super excited about the possibilities of what they can do with content management, authoring/publishing, forms, workflows, records management, etc. Although over 50 WSS servers in 40 locations may seem like a nightmare, the team has consolidated services from more than 100 rogue, unmanaged, un-backed-up solutions with thousands of different interfaces and no coordination or common platform. The FrontPage authoring has moved to out-of-the-box design in IE or SharePoint Designer, and most sites have moved to a common look and feel with master pages. Little of what used to require custom development needs to happen any longer. Although the clients have not yet been upgraded, the server interfaces are consistent. Single sign-on and AD are not yet a possibility in all locations due to lack of connectivity, but they are looking at ADAM and at consolidating their LDAP directories to soon provide a single username and password. The flexible authentication has allowed them to move quickly to WSS 3.0 with the various auth schemes, with little custom work to build a provider. The blogs and wikis have really taken off and have consolidated TONS of work that would have taken hours of coordination and is now done ad hoc. Migration continues as collaborative shares, IIS and FrontPage web sites, and Apache and PHP apps are moved to leverage out-of-the-box functionality in WSS and MOSS 2007.
Ops teams? This ops team is 2 part-time workers/volunteers who meet with the governing body. They did bring in a consultant to do the MOSS deployment and help them get it up and going, but his time ended up being donated afterward.