Architecture + Strategy

Musings from David Chou - Architect, Microsoft

January, 2009

  • Architecture + Strategy

    Cloud Computing and the Microsoft Platform


    It has been a couple of months since I wrote about cloud computing and Microsoft’s plans and strategies. Now that the Azure Services Platform has been unveiled at PDC2008, and after having had the opportunity to discuss it with a community of architects from major enterprises and startups via the Architect Council series of events, I can talk about cloud computing from the perspective of the Microsoft platform, and the architectural considerations that influenced its design and direction.


    Okay – cloud computing today is a really overloaded term, much more so than SOA (service-oriented architecture) was when it was the hottest ticket in IT. There are a lot of different perspectives on cloud computing, adding to the confusion and the hype. And unsurprisingly, there is a lot of confusion around Microsoft’s cloud platform too. So here is one way of looking at it.

    Microsoft’s cloud includes SaaS (Software-as-a-Service) offerings as shown in the top row of the above diagram, such as Windows Live and the Business Productivity Online Suite; and the PaaS (Platform-as-a-Service) offering currently branded as the Azure Services Platform. For the rest of this article we will focus on the Azure Services Platform, as it represents a platform on top of which additional capabilities can be developed, deployed, and managed.

    Comprehensive Software + Services Platform

    At Microsoft, we believe that the advent of cloud computing does not necessitate that existing (or legacy) IT assets be moved into the cloud, as it makes more sense to extend to the cloud as opposed to migrate to the cloud. We think that eventually, a hybrid world of on-premises software and cloud-based services will be the norm, although the balancing point between the two extremes may vary greatly among organizations of all types and sizes. As a platform company, Microsoft’s intention is to provide a platform that can support the wide range of scenarios in that hybrid world, spanning the spectrum of choices between on-premises software and cloud-based services.

    Thus Microsoft’s cloud platform, from this perspective, is not intended to replace the existing on-premises software products such as our suite of Windows Server products, but rather, completes the spectrum of choices and the capabilities required for a Software + Services model.

    Cloud Platform as a Next-Generation Internet-Scaled Application Environment

    So what is a cloud platform? It should provide an elastic compute environment that offers auto-scalability (small to massive), and ~100% availability. However, while some think that the compute environment means a server VM (virtual machine) allocation/provisioning facility that provides servers (i.e., Windows Servers, Linux Servers, Unix Servers, etc.) for administrators to deploy applications into, Microsoft’s approach with the Azure Services Platform is remarkably different.

    Azure Services Platform is intended to be a platform to support a “new class of applications” – cloud applications.

    On the other hand, the Azure Services Platform is not a different location to host our existing database-driven applications such as traditional ASP.NET web apps or third-party packaged applications deployed on Windows Server. Cloud applications are a different breed of applications. Now, the long-term roadmap does include capabilities to support Windows-Server-whichever-way-we-want-it, but I think the most interesting/innovative part is allowing us to architect and build cloud applications.

    To clarify, let us take a quick look at the range of options from an infrastructure perspective.

    The diagram above provides a simplified/generalized view of choices we have from a hosting perspective:

    • On-premises: represents the traditional model of purchasing/licensing and acquiring software, installing it, and managing it in our own data centers
    • Hosted: represents the co-location or managed outsourced hosting services. For example, GoGrid, Amazon EC2, etc.
    • Cloud: represents cloud fabric that provides higher-level application containers and services. For example, Google App Engine, Amazon S3/SimpleDB/SQS, etc.

    From this perspective, “Hosted” represents services that provide servers-at-my-will, but we will interact with the server instances directly, and manage them at the server level so we can configure them to meet our requirements, and install/deploy applications and software just as we have done with existing on-premises software assets. These service providers manage the underlying infrastructure so we only have to worry about our servers, but not the engineering and management efforts required to achieve auto-scale and constant availability.

    “Cloud” moves the concerns even higher up the stack, where application teams only need to focus on managing the applications and specifying to the environment their security and management policies, and the cloud infrastructure will take care of everything else. These service providers manage the application runtimes, so we can focus on deploying and managing business capabilities, as well as higher-level and differentiating aspects such as user experience, information architecture, social communities, branding, etc.

    However, this does not mean that any one of these application deployment/hosting models is inherently better than the other. Yep, while most people look at “hosted” and “cloud” models as described here, both as cloud platforms, they are not necessarily more relevant than the on-premises model for all scenarios. These options all present varying trade-offs that we as architects need to understand, in order to make prudent choices when evaluating how to adopt or adapt to the cloud.

    Trade-Offs in the Cloud

    Let us take a closer look at the trade-offs between the on-premises model and the cloud (as differences between “hosted” and “cloud” models are comparatively less).

    At the highest level, we are looking at trade-offs between data consistency and scalability/availability. This is a fundamental difference between on-premises and cloud-based architectures, as “traditional” on-premises system architectures are optimized to provide near-real-time data consistency (sometimes at the cost of scalability and availability), whereas cloud-based architectures are optimized to provide scalability and availability (by compromising data consistency).

    One way to look at this, for example, is how we used to design and build systems using on-premises technologies. We used methods such as pessimistic locking, optimistic locking, and two-phase commit to ensure proper handling of updates to a database from multiple threads. And this focus on ensuring the accuracy and integrity of the data was deemed one of the most important aspects in modern IT architectures. However, data consistency is achieved by compromising concurrency. For example, in DBMS design, the highest transaction isolation level, “serializable”, means all transactions behave as if they occurred in a serial manner (in a way, single-threaded), which promises safe updates from multiple clients. But that adversely impacts performance and scalability in highly concurrent systems. Lowering the isolation level helps to improve concurrency, but the database loses some control over data integrity.
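
    To make the optimistic locking mentioned above a little more concrete, here is a minimal Python sketch. It is an illustration only, not any particular database's API: `VersionedStore`, `StaleWriteError`, and the `stock:widget` key are hypothetical names, and an in-memory dictionary stands in for a table with a version column. The idea is that each write states which version it was based on, and the store rejects the write if the row has changed in the meantime.

```python
from dataclasses import dataclass


class StaleWriteError(Exception):
    """Raised when a write is based on an outdated version of a row."""


@dataclass
class Row:
    value: int
    version: int = 0


class VersionedStore:
    """Hypothetical in-memory stand-in for a table with a version column."""

    def __init__(self):
        self._rows = {}

    def insert(self, key, value):
        self._rows[key] = Row(value)

    def read(self, key):
        # Return the value together with the version it was read at.
        row = self._rows[key]
        return row.value, row.version

    def write(self, key, new_value, expected_version):
        # Accept the write only if nobody changed the row since it was read.
        row = self._rows[key]
        if row.version != expected_version:
            raise StaleWriteError(key)
        row.value = new_value
        row.version += 1


def demo_lost_update_prevented():
    store = VersionedStore()
    store.insert("stock:widget", 10)

    # Two clients read the same version of the row.
    value_a, version_a = store.read("stock:widget")
    value_b, version_b = store.read("stock:widget")

    # Client A commits first, which bumps the version.
    store.write("stock:widget", value_a - 1, version_a)

    # Client B's write is based on a stale version and is rejected.
    try:
        store.write("stock:widget", value_b - 1, version_b)
        rejected = False
    except StaleWriteError:
        rejected = True

    return store.read("stock:widget"), rejected
```

    In this sketch the second client would have to re-read the row and retry. No reader is ever blocked, but conflicting writers pay for the conflict, which is the consistency versus concurrency trade-off in miniature.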

    Furthermore, when we look at many of the Internet-scale data platforms, such as Amazon S3/SimpleDB, Google BigTable, and the open source Hadoop, their designs and approaches are very different from traditional on-premises RDBMS software. Their primary goal is to provide scalable and performant databases for extremely large data sets (lots of nodes and petabytes of data), which resulted in trading off some aspects of data integrity and required users to accommodate data that is “eventually consistent”.
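
    As a rough illustration of what “eventually consistent” means for a reader, the Python sketch below models a store that acknowledges a write after updating one replica and copies it to the others later. This is a toy model under stated assumptions, not how S3, SimpleDB, or BigTable are implemented: `EventuallyConsistentStore`, `Replica`, and the explicit `propagate()` step are hypothetical stand-ins for asynchronous background replication.

```python
class Replica:
    """One copy of the data."""

    def __init__(self):
        self.data = {}


class EventuallyConsistentStore:
    """Toy store: writes land on one replica and reach the rest later."""

    def __init__(self, replica_count=3):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = []  # queued (replica_index, key, value) updates

    def write(self, key, value):
        # Acknowledge as soon as the first replica has the value,
        # and queue the update for every other replica.
        self.replicas[0].data[key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, key, replica_index):
        # A read is served by whichever replica the client happens to reach.
        return self.replicas[replica_index].data.get(key)

    def propagate(self):
        # Stand-in for background replication catching up.
        for index, key, value in self.pending:
            self.replicas[index].data[key] = value
        self.pending.clear()


def demo_stale_read():
    store = EventuallyConsistentStore()
    store.write("price", 100)
    store.propagate()

    store.write("price", 120)
    stale = store.read("price", replica_index=2)  # replication has not run yet
    store.propagate()
    fresh = store.read("price", replica_index=2)  # replicas have converged
    return stale, fresh
```

    Here the write never waits for all replicas, so it stays fast and available, but a client may briefly see the old value. Applications built on such stores have to tolerate that window.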

    Amazon Web Services CTO, Werner Vogels, has recently updated his thoughts on “eventual consistency” in highly distributed and massively scaled architectures. An excellent read for more details behind the fundamental principles that contribute to this trade-off between the two models.

    Thus, on-premises and cloud-based architectures are optimized for different things. And that means on-premises platforms are still relevant, for specific purposes, just as cloud-based architectures are. We just need to understand the trade-offs so each can be used effectively for the right reasons.

    For example, an online retailer’s product catalog and storefront applications, which are published/shareable data that need absolute availability, are prime candidates to be built as cloud applications. However, once a shopping cart goes into checkout, then that process can be brought back into the on-premise architecture integrated with systems that handle order processing and fulfillment, billing, inventory control, account management, etc., which demand data accuracy and integrity.

    The Microsoft Platform

    I hope it’s kind of clear why Microsoft took this direction in building out the Azure Services Platform. For example, the underlying technologies used to implement Azure include Windows Server 2008, but Microsoft decided to call the compute capability Windows Azure, because it represents application containers that operate at a higher level in the stack, instead of Windows Server VM instances for us to use directly. In fact, it actually required more engineering effort this way, but the end result is a platform that provides extreme scalability and availability, the transparency of highly distributed and replicated processes and data, while hiding the complexities of the systems automation and management operations on top of a network of globally distributed data centers. This should help clarify, at a high level, as to how Azure can be used to extend existing/legacy on-premise assets, instead of being just another outsourced managed hosting location.

    Of course, this is only what this initial version of the platform looks like. From a long-term perspective, Microsoft does plan to increase parity between the on-premise and cloud-based platform components, especially from a development and programming model perspective, so that the applications can be more portable across the S+S spectrum. But the fundamental differences will still exist, which will help to articulate the distinct values provided by different parts of the platform.

    Thus the Azure Services Platform is intended for a “new class of applications”. Different from the traditional on-premises database-driven applications, the new class of “cloud applications” is increasingly “services-driven”, as applications operate in a service-oriented environment, where data can be managed and provisioned as services by cloud-based database service providers such as Amazon S3/SimpleDB, Google MapReduce/BigTable, Azure SQL Services, Windows Azure Storage Services, etc., and capabilities are integrated from other services running on the Web, provisioned by various private and public clouds. Applications of this type inherently operate at Internet scale, and are designed with a different set of fundamentals such as eventual consistency, idempotent processes, federated identity, services-based functional partitioning and composition (loose-coupling), isolation, parallel and replicated data and process architecture, etc.
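
    Of the fundamentals listed above, “idempotent processes” is perhaps the easiest to show in a few lines. The Python sketch below assumes a messaging layer that may deliver the same message more than once, which is a common property of distributed queues. `PaymentLedger` and the message IDs are hypothetical names for illustration and do not correspond to any Azure or Amazon API.

```python
class PaymentLedger:
    """Toy consumer that applies each message at most once."""

    def __init__(self):
        self.balance = 0
        self._seen = set()  # IDs of messages already applied

    def apply(self, message_id, amount):
        """Apply a credit once per message_id; return True if it was applied."""
        if message_id in self._seen:
            return False  # duplicate delivery, safely ignored
        self._seen.add(message_id)
        self.balance += amount
        return True


def demo_redelivery():
    ledger = PaymentLedger()
    # The message "m1" is delivered twice, as can happen after a retry.
    deliveries = [("m1", 50), ("m2", 25), ("m1", 50)]
    results = [ledger.apply(message_id, amount) for message_id, amount in deliveries]
    return ledger.balance, results
```

    Because a repeated delivery has no further effect, the sender can retry freely after a timeout without coordinating a distributed transaction.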

    This post is part of a series of articles on cloud computing and related concepts.

  • Architecture + Strategy

    Microsoft Tag – Interactive Mobile Bar Codes


    One of the interesting items announced at the Consumer Electronics Show (CES) this week was Microsoft Tag. It is a different kind of bar code, intended to be read by cameras in cell phones and mobile devices with Internet connectivity.

    [Image: tag2, by benjamingauthey]

    Each tag can be generated and managed on the Microsoft Tag website, and each one can be associated with a number of things, such as a URL (link to a website), free text, vCard (electronic business card), or dialer (call-out #).

    A reader application is required on the mobile device to interpret the tag, and interact with the Microsoft Tag service to find the associated information. It is available for the iPhone, BlackBerry 81xx/83xx/Bold, J2ME, Symbian S60-3E, and Windows Mobile 5/6. Point your mobile browser to and install.

    Of course, this isn’t anything new. Various types of codes have existed for a long time now. A Wikipedia entry on bar codes provides a nice overview, which also shows the HCCB (High Capacity Color Barcode) format Microsoft Tag uses. But now we have a free service to generate and manage them.

    There are a lot of applications for this, but in general mobile tagging provides a bridge between the physical and the online worlds. Anything a camera can see, can be embedded with a tag, and associated with a specific piece of content. For example, I pointed my cell phone at my screen to interpret the tag above, and it worked!

    Now technically, printing/showing URLs or related information on physical items would do pretty much the same, but mobile tagging automates the human interaction part of it: we don’t have to read the printed information and then key it into the device to retrieve the content. Now we just need to point the device at the tag, and it will display the associated information. It also means that while the physical tag may be static, the information it is associated with can be dynamic. I can print a tag as my business card, and not worry about it being outdated or needing to update business contacts if any of my information changes.
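
    The “static tag, dynamic information” point is essentially a level of indirection: the printed tag carries only an identifier, and a service resolves that identifier to whatever content is currently associated with it. The Python sketch below illustrates that idea only. It is not the Microsoft Tag service or its API; `TagRegistry`, the tag ID, and the contact details are all made up for the example.

```python
class TagRegistry:
    """Hypothetical lookup service mapping a printed tag ID to its content."""

    def __init__(self):
        self._payloads = {}

    def register(self, tag_id, kind, content):
        # kind is one of the association types, e.g. "url", "text", "vcard", "dialer".
        self._payloads[tag_id] = (kind, content)

    def update(self, tag_id, kind, content):
        # Change what an existing tag points to; the printed tag is untouched.
        if tag_id not in self._payloads:
            raise KeyError(tag_id)
        self._payloads[tag_id] = (kind, content)

    def resolve(self, tag_id):
        # What a reader application would look up after decoding the tag.
        return self._payloads[tag_id]


def demo_business_card():
    registry = TagRegistry()
    registry.register("tag-001", "vcard", "Jane Doe; +1-555-0100")
    before = registry.resolve("tag-001")

    # The phone number changes, but the card already in circulation does not.
    registry.update("tag-001", "vcard", "Jane Doe; +1-555-0199")
    after = registry.resolve("tag-001")
    return before, after
```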

    More information available at the Microsoft Tag website

  • Architecture + Strategy

    SOA – End of Life 2009.01.01


    It has just been a few days since Anne Thomas Manes at Burton Group published her post “SOA is Dead; Long Live Services”, and it has stirred up quite a storm of comments in the blogosphere. Most of what I read, though, seems to be in alignment with what Anne Thomas Manes said:

    SOA met its demise on January 1, 2009, when it was wiped out by the catastrophic impact of the economic recession. SOA is survived by its offspring: mashups, BPM, SaaS, Cloud Computing, and all other architectural approaches that depend on “services”.

    Her article clarified that it is “SOA as we know it” (and the terminology used) that has faded into irrelevance: the SOA that called for a comprehensive transformation of an organization’s view and management of its portfolio of data, technology, process, and people. Indeed, many people (such as David Linthicum, Eric Roch, JP Morgenthal, and the ongoing debate on InfoQ) have, for a number of years now, been cataloguing why most enterprise SOA efforts fail miserably.

    In general, I think the community is coming to the realization that SOA really is an architectural approach, not a set of technologies to implement. From that perspective Microsoft actually has been spot-on in terms of not offering “SOA”-branded products, but instead advocating that customers carefully design and build the right type of SOA for their organizations.

    While there are many, many identified technical reasons why most SOA projects don’t succeed, Mike Kavis in his post has articulated one perspective nicely at a high level (just summarizing his list here):

      1. We think process is a bad thing and it slows us down
      2. We are impatient
      3. We don’t understand what an architect really is
      4. We don’t understand what architecture really is
      5. We lose sight of the value and argue semantics
      6. We lack leadership skills and emotional intelligence

    This highlights one reason why SOA hasn’t been successful: the human factor. But this doesn’t only apply to SOA; it’s just that the SOA requirements for organizational transformation and consistency amplify issues associated with the human factor. So what aspects of the human factor make SOA difficult to implement?

    Lack of patience, persistence, and perseverance

    I think this is applicable on many levels. SOA requires a long-term, incremental build approach, but many projects are required to justify immediate or short-term ROI. Or from a different perspective, people just naturally expect to see some form of immediate benefits, and lose interest/motivation when the reality of SOA hits after the first few initial projects, which are often ESB-driven infrastructure optimization efforts, or point-to-point integration efforts. The lack of immediate business agility and cost savings gives people excuses to question the approach, reduce level of support, etc.

    There is also the aspect of jumping on the bandwagon simply because SOA was the acronym du jour and it seemed smart to talk about it, without investing sufficient research, discipline, and due diligence to do it right. Of course, those who are impatient to jump on the bandwagon would just as quickly (if not more so) jump off at the first sign of trouble. There was sufficient intention or willingness to invest in SOA endeavors, but the impatience of not acquiring the necessary expertise resulted in failures.

    And truth is, SOA is not easy. Many organizations lose sight of the most important aspect of SOA - “how” to do SOA, not “what” we do SOA with. To many organizations it just seems simpler to follow marketing hype and implement products that are branded as SOA suites and think that an SOA can be constructed using the new infrastructure.

    Resistance to change

    People are naturally resistant to change, especially tough changes like SOA. Large organizations stand to gain more from SOA, but at the same time, those large organizations that have operated for many years in traditional functional silos have always resisted enterprise-level efforts that require them to build more dependencies on shared resources.

    And SOA meant changes across all aspects of the IT disciplines as we know them. Operationally, traditional SLA management processes need to be adapted as downstream systems may need to inherit availability and performance requirements from upstream systems. Design-wise, it’s not just about exposing functionality as services, but more about how a function is useful from the enterprise’s perspective; and that requires a higher level of collaboration beyond one department’s development teams.
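
    The SLA point can be made concrete with some simple arithmetic. If a service calls other services synchronously and failures are independent (a simplifying assumption), its effective availability is its own availability multiplied by that of each dependency, so every dependency has to be better than the target of the service that relies on it. The Python sketch below works through that; the function names and the figures are illustrative only.

```python
from math import prod


def composite_availability(own, dependencies):
    """Availability of a service that needs all of its dependencies to be up.

    Assumes synchronous calls and independent failures.
    """
    return own * prod(dependencies)


def required_dependency_availability(target, own, dependency_count):
    """Availability each of N equal dependencies needs to meet the target."""
    return (target / own) ** (1 / dependency_count)


def downtime_hours_per_year(availability):
    """Expected unavailable hours in a 365-day year."""
    return (1 - availability) * 365 * 24


# A 99.9% service calling a 99.9% and a 99.5% dependency
# ends up at roughly 99.3% end to end.
end_to_end = composite_availability(0.999, [0.999, 0.995])

# To offer 99.9% overall with a 99.95% front end and two dependencies,
# each dependency needs to be better than 99.9% itself.
needed = required_dependency_availability(0.999, 0.9995, 2)
```

    This is why a system that was comfortable with its own departmental SLA can suddenly find itself held to a stricter one once other systems start depending on it.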

    Also, distributed computing is not simple. When we build process-level dependencies on other systems, the effort required to troubleshoot issues in systems that one does not fully control becomes orders of magnitude more challenging. This requires a major adjustment from the ways IT teams work today, and in those cases it’s often easier to point the finger at others first.

    Organizational dynamics

    The above aspects often apply to individuals. But when we look at an organization as a whole, the effects are also amplified. SOA requires a higher level of collaboration between teams in an organization. Each team or department is used to having a higher level of autonomy in terms of managing its budgets, schedules, clients, technology, etc., relatively independently from other teams. Finding the right balance between organizational consistency and flexibility with sufficient local autonomy unfortunately requires, in itself, a uniform understanding and approach within the organization.

    And politics. Not everyone likes to work with everyone, and individuals used to be relatively shielded within their own teams/departments/silos. But SOA requires breaking down the walls of silos, and can expose people more to personalities they may not like to work with, causing more contention among people. If not managed carefully, such as not positively reinforcing the correct behaviors, this can more quickly send the wrong signals to workers and hinder progress.

    Lastly, strategic thinkers who understand what it takes to do SOA tend to be the minority in today’s IT organizations, which typically focus on tactical goals and are also measured as such. It’s just difficult for a few individuals to influence and steer an organization to adapt to new changes.

    So what now?

    Thus it’s the “how” we do SOA that is the most important. It is the architectural disciplines, organizational cohesion, strategic leadership, etc. that most significantly impact the outcome of SOA efforts. And from that perspective the architectural principles of service-oriented architecture are still sound. In fact, as many people are already jumping into the next new big things such as cloud computing, “service”-oriented or driven concepts and considerations become much more important than before. Furthermore, cloud computing, in my opinion, has more to do with services than with simply moving existing on-premises infrastructure into a utility-based cloud.

    Perhaps it’s time to take a hard look at each SOA effort and ask the hard questions. Is it really meaningful, or valuable, to do SOA for your organization? Does your enterprise really need real-time process-driven integration, or can traditional data integration, or a hybrid model, suffice? Can people in your organization work collaboratively towards common goals and standards? Does your organization have what it takes to undergo and withstand such transformation? And so forth.

    This doesn’t mean we should stop doing SOA (regardless of what name we use to call it), but we should do it for the right reasons, and do it right. And more importantly, we need the right people in the right places to see the plan through. This means having the right skills to plan and lead an organization to transform all aspects of data, technology, processes, and people (including knowing how to deal with the human factors mentioned above). There are still very significant benefits that SOA can bring, evident in the few organizations that have been successful with it.

    Another lesson that can be learned from this ongoing discussion is that SOA was deemed unsuccessful because it presented very significant gaps relative to the way existing IT organizations work today (such as what Dana Gardner mentioned in his post). However, we have to be careful in thinking that “abandoning SOA” and moving on to the next big thing – cloud computing – will solve all of our issues (doesn’t that sound eerily familiar?). It is evident that these gaps are so large for many organizations that the organizational aspects become the biggest impediments to progress. As technologists it is easy for us to say that the next major innovative technology trend will bring about sweeping changes and transformational benefits. But the reality is, cloud computing, as an extension of SOA, will require even more maturity and competency in working with SOA to implement successfully. Indeed cloud computing presents a more prominent influencing factor to transform legacy IT, but it won’t make it any easier than SOA did. Organizations that want to take advantage of these transformative technology trends need to not only understand the technologies involved, but really pay attention to planning the organizational and people side of the endeavors.

  • Architecture + Strategy

    Series - Cloud Computing and Microsoft


    This is an ongoing series of posts and studies on cloud computing and related concepts, and the Microsoft platform and strategies. This post will serve as an index of related posts on this blog, and will be updated periodically.

    Cloud Computing

    Service-Oriented Architecture

    Web 2.0

    Software Plus Services

  • Architecture + Strategy

    Event – XamlFest on January 14 & 15, 2009 at Irvine, California


    Are you interested in WPF (Windows Presentation Foundation) but concerned about the learning curve? Have you seen Silverlight but don’t know where to get started? Or are you curious about how tools like Visual Studio and Expression Blend help designers and developers work together to deliver great user experiences? If so, join us at XamlFest!

    XamlFest is a two-day interactive event where you’ll learn about the platforms, tools, and processes used to deliver differentiated user experiences. It’s a chance for you to mingle with UX-minded Microsoft folks as well as industry-leading design integrators. It’s also an opportunity to pick up a free copy of Visual Studio 2008 and Expression Studio 2 for your attendance.

    More details are available at
