Architecture + Strategy

Musings from David Chou - Architect, Microsoft

    Crowdsourcing Discussion at Caltech | June 11, 2011


    Crowdsourcing, as a method of leveraging the massive pool of online users as resources, has become a viable set of techniques, tools, and marketplaces for organizations and entrepreneurs to drive innovation and generate value. It is applied effectively in a variety of explicit/implicit, and systematic/opportunistic models: collective intelligence (Wikipedia, Yelp, and Twitter analytics); collaborative filtering (Amazon’s product and Netflix’s movie recommendation engines); social tagging (, StumbleUpon, and Digg); social collaboration (Amazon’s Mechanical Turk); and crowdfunding (disaster relief, political campaigns, micropatronage, startup and non-profit funding, etc.)

    An entrepreneur can utilize crowdsourcing tools for funding and monetization, task execution, and market analysis; or implement crowdsourcing as a technique for data cleansing and filtering, derived intelligence, etc. However, leveraging crowdsourcing also requires an entrepreneur to navigate a complex landscape of questions. How is it different from outsourcing? Is it truly cost-efficient? What motivates individual contributions? How to grow and sustain an active community? How to ensure quality or service level? What are the legal and political implications? Join us for a program which explores such questions, and the use of crowdsourcing to match specific needs to an available community.

    Come interact with the local community, and an esteemed group of speakers which includes Peter Coffee (VP, Head of Platform Research at, Dana Mauriello (Founder and President at ProFounder), Michael Peshkam (Founder and CEO at IamINC), Nestor Portillo (Worldwide Director, Community & Online Support at Microsoft), Arvind Puri (VP, Interactive at Green Dot), and Alon Shwartz (Co-Founder & CTO at

    June 11 9am-11am. Visit the website for more details and registration information -

    Microsoft Implementing Software Plus Services


    Microsoft has been talking about "Software + Services" (S+S) as its vision of the future for a while now (see related posts on S+S: Microsoft Platform Overview & Talking about Software Plus Services). People like Bill Gates and Ray Ozzie often talk about the applicable patterns and trends that exemplify this concept, even though they don't always mention the moniker.

    Microsoft's execution on this direction is quite visible too, from continued investments in desktop and enterprise software to the latest, still-growing cloud platform that brings many of the traditional capabilities to the Web.


    For example, many of the enterprise servers (Exchange, SharePoint, Office Communications, and eventually BizTalk and SQL Server as well) are being implemented as services in the cloud that users can use directly, without investing in their own physical infrastructure to host and manage them. There is also a lot of progress being made in the consumer space in the form of Windows Live services.

    However, a major value proposition in S+S is the ability to integrate traditional software with distributed services, and bring the best of both worlds together. What has Microsoft done so far to implement that S+S vision?

    Basically, many efforts are happening across the board. Some of the more visible ones include:

    Exchange - it supports multiple delivery models (hosted on-premise, outsourced hosting/management by a partner, and cloud-based service from Microsoft), many clients (Outlook, OWA, Outlook Mobile, Outlook Voice Access), and multiple licensing models (traditional perpetual and subscription); it can also itself be a consumer of attached services such as Forefront spam/filtering services


    Office System - Office clients combined with SharePoint Server represent a business productivity platform (client-server interaction that leverages the many valuable enterprise services in SharePoint, such as enterprise search, content management, business data catalog, business intelligence, etc.). Excel spreadsheets can be published into SharePoint and then provisioned as web services. InfoPath forms, stored as part of SharePoint’s InfoPath services, can be rendered on InfoPath clients or directly from SharePoint as forms services. Office clients themselves can also be extended with .NET to connect to back-end systems, whether directly or via SharePoint or BizTalk. Cloud-based counterparts include Office Live Workspace, a cloud-based SharePoint service for consumers, and SharePoint Online for businesses.


    SharePoint - SharePoint Server itself can be deployed on-premise, hosted by an outsourcing partner, or accessed as a subscription service from Microsoft (SharePoint Online). It also has other flavors, such as Office Live and Office Live Workspace, that live in the cloud as services for consumers to use

    Windows Live - known as a set of cloud-based services, but Microsoft has also delivered a set of client-side software (Mail, Messenger, PhotoGallery, Toolbar, Writer) to improve the user experience, in addition to the browser-based interfaces. Many of the services also offer APIs for people to build applications with.


    Office Communications Server - similar to Exchange, it now also has a cloud-based service for people to use (Office Communications Online), plus APIs for developers to build specific branding and user experiences

    Duet - a product that integrates Microsoft Office with SAP. Basically users can use the Office clients as the UI to SAP services

    Xbox - Xbox Live is one of the first examples of S+S

    Dynamics - similar model to Exchange - multiple deployment/delivery models, licensing models, and client access channels

    Windows - Windows Update is a componentized client and cloud-based service interaction model; OneCare is similar

    These examples all demonstrate the fundamental principles of S+S.


    One recent offering that is particularly interesting is Office Live Workspace ( This service offering, in a way, is Microsoft's response to Google Apps. Instead of converting the Office client software suite (Outlook, Word, Excel, PowerPoint, Groove, OneNote, Visio, InfoPath, Access, etc.) into browser-based solutions to compete head-on with Google Apps, Office Live Workspace was delivered to offer the sharing and collaboration capabilities whose absence has been cited as the biggest shortcoming of the Office clients.

    Microsoft has actually been delivering SharePoint services for a number of years to support file sharing and collaboration scenarios for workgroups and enterprises. But there was a gap for consumers and inter-organizational scenarios that traditional SharePoint deployments (inside the firewalls) don't address very well.

    Thus Office Live Workspace is still built on SharePoint, but has been designed specifically to support consumer and end-user collaboration. It provides fine-grained document-level access control, ubiquitous access, cloud-based storage, and client-side add-ons that integrate directly into the Office clients. So users can create, open, and save documents in Office Live Workspace directly from Word or Excel, for example. And of course, users always have the option to save documents locally until they're ready to share with other people.

    This illustrates the S+S approach of leveraging the best of both worlds: rich client-side software that fully leverages the power of the client device platform to maximize individual productivity, combined with cloud-based platforms for sharing and collaborating with others to maximize group productivity. (Rich client software is sometimes criticized as bloatware, but it can also be seen as having capabilities ready to use wherever a user is, with or without internet access.)

    Active Directory and BizTalk in the Cloud?


    A colleague pointed me to an interesting blog post – Two products Microsoft should set free into Cloud, which ended with this question:

    So Microsoft – here is a market that is begging to be served and yours to lose. While you still have work to do to make your to Azure Platform, Business Applications, Office Suite widely adopted in Cloud, BizTalk and Active Directory are the need of the hour and are ready to go. So waste no more time – let them free and watch them soar in Cloud.

    Now, if cloud computing is simply outsourced hosting, then Microsoft could just start selling Active Directory and BizTalk as a SaaS offering today. But I tend to think that cloud computing represents a new paradigm (basically, more distributed computing than utility computing), and more value can be gained by leveraging cloud as a new paradigm.

    Below is the rather lengthy comment I left on that blog.

    Active Directory and BizTalk not being part of the Microsoft cloud platform today (either in SaaS or PaaS model) doesn’t mean Microsoft doesn’t want to “set them free into cloud”. In fact, our long-term roadmap has been to make all of our software products and platforms available in the cloud in some form.

    So then why haven’t we? Shouldn’t it be pretty simple to deploy instances of Active Directory and BizTalk in Microsoft data centers and let customers use them, a-la-SaaS-style? The answer lies in the fundamental question – is cloud computing simply server hosting in other people’s data centers, or is it a new paradigm we can leverage to do things differently?

    Microsoft’s approach to cloud computing is exactly that: provide the right solutions for cloud computing to effectively support the new paradigm. For example, in Microsoft’s SaaS offerings today there are both single-tenant and multi-tenant versions of the Exchange, SharePoint, and Office Communications Online suites; and in the PaaS offerings, SQL Azure is a fully multi-tenant relational database service and not simply hosted SQL Server, and Windows Azure’s native roles are provided via a higher-abstraction, container-like model, and not simply hosted Windows Server.

    So then the question is, what’s the right cloud model for Active Directory? That is still under consideration, but my personal opinion is that we still need to carefully evaluate a few factors:

    • Do customers really want to outsource their identity management solution? Is there really a lot of demand for hosted enterprise identity management services?
    • What are the true benefits of hosting the identity management solution elsewhere? Just some cost savings from not managing your own servers? That might be the case for smaller companies, but larger organizations prefer the private cloud approach
    • For example, the identity management solution is essential in managing access control across an IT architecture. Wouldn’t it work better if it’s maintained closer, in terms of proximity, to the assets it’s intended to manage? Keep in mind that most “pure cloud” vendors who advocate otherwise, use their own identity management infrastructure hosted in their own data centers
    • And from an external, hybrid cloud, and B2B integration perspective, identity federation works pretty well to enable single sign-on across resources deployed in separate data centers and security domains
    • Lastly, what’s the right model for a cloud-based identity management solution? Is it making the online identity metasystem more “enterprise-like”, such as adding some of the fine-grained management capabilities to the Live ID infrastructure, or developing a multi-tenant version of Active Directory that can better address some of the consumer identity scenarios?

    Similarly for BizTalk, many of the above points apply as well for its cloud aspirations, plus a few specific ones (again just my personal opinion):

    • Process and data integration between organizations (such as traditional B2B scenarios) and different cloud-based services operated by separate organizations, is a lot different from traditional enterprise integration scenarios where enterprise service bus type of solutions fit in today. It has a lot more to do with service management, tracking, and orchestration in an increasingly more service-oriented manner; as opposed to having system and application-specific adapters to enable communication
    • Also, EAI and ESB types of integration place the center of gravity, in terms of context and entity definition, within one enterprise. Cloud-based integration, such as outsourced process management, multi-enterprise integration, etc., shifts the center of gravity into the cloud, in a much more shared/federated manner
    • The question then is, what is the right type of integration-as-a-service solution for cloud-based integration scenarios? We have many integration hub service offerings today, many of which grew from their EDI/VAN, managed FTP, B2B, supply chain management, e-commerce, and RosettaNet, ebXML, HL7 roots. The landscape for external integration is vastly more diverse and generic (in each vertical) than any one organization’s way of managing processes
    • Some initial direction can be observed in Windows Azure AppFabric today, with the Service Bus offering. It works as an Internet service bus to help facilitate communication regardless of network topologies. It advocates a federated application model in a distributed environment, where processes and data are integrated in a service-oriented manner. It’s a much more dynamic environment (changes are more frequent and preferred) than a more static environment in an on-premise systems integration scenario
    • Thus is it correct to simply have BizTalk hosted and sell it as a cloud-based integration solution? Will an on-premise systems integration approach effectively handle integration scenarios in a more dynamic environment?

    Pure cloud pundits often ask “why not cloud?” But I think it’s also fair to counter that question with “why?” Not all IT functions and workloads are ideally suited for external deployment. A prudent architect should carefully consider what are the right things to move into the cloud, and what are the right things to still keep on-premise, instead of doing external cloud deployment just for the sake of doing so. There’s a big difference between “can” and “should”.

    One way of finding the right balance between what should move into the cloud and what should stay on-premise is to look at where the users are. Applications that are consumed by users on the Web are excellent candidates to move into public clouds. Internal business applications that support a back-office operation are often still better maintained on-premise, closer to an organization’s workforce. It’s also a nice general approach to balancing the trade-offs between security and control on one side and scalability and availability on the other.

    Thus eventually Microsoft will have some form of enterprise-level identity management solution, and multi-enterprise integration solution, available as cloud-based services. But these don’t necessarily have to be hosted Active Directory and BizTalk Server as we know them today. :)

    Popfly as a Web Platform




    Microsoft Popfly (, currently in beta since October 2007, is a web site and tool to help people create and share web sites, mashups, and other kinds of experiences.

    This service, in my opinion, is a really interesting and innovative product Microsoft has delivered this year. From an architect's perspective, Popfly can be considered as a Web platform, along with the many other interesting ones created this year, such as the Facebook Platform.

    Many people also saw Popfly's potential as a Web platform. For example, Mary Jo Foley correlated it to Yahoo! Pipes, Tom Foremski described how easy it is to build a Facebook app with Popfly, John Mullinax provided a business perspective on how to leverage Popfly, and Denny Boynton shared some architectural thoughts.

    A Web Platform

    In an earlier blog post I talked about "Web as a Platform" (in Web 2.0's context) and briefly described a layered and componentized perspective in looking at the Web platform in general. Popfly fits in that perspective very well, and can be categorized into a composition tools layer that doesn't seem to have received a lot of attention from the general Web 2.0 community. Specifically, in the programmable Web aspect of Web 2.0, the focus has been on creating the APIs, frameworks, runtime environments, standards, etc. to facilitate the various kinds of applications and social interactions. But the tasks of developing these applications still rely on traditional code-based environments. Popfly represents a major innovation on the composition tool side, and does it in an elegant way that transformed the bootstrapping requirements of various kinds of services and APIs available in the cloud, into, literally, building blocks that people without any technical background can piece together (like LEGO!) and create all kinds of composite applications (or mashups). It also offers a provisioning and syndication system so these applications can be deployed (or embedded into web pages) anywhere on the Web (and coined the term "mashout").

    Popfly has been compared to Yahoo! Pipes, which provides a very elegant composition tool for aggregating and manipulating syndicated content (and a wickedly cool implementation of JavaScript in its development environment). It is a very powerful platform in terms of programmability in the context of mashing up data. Another is Google Mashup Editor, which is also a very powerful tool that helps people quickly create mashup applications. Without turning this into a comparison of the three tools, in general I think each provides distinct value and meets different needs. For example, Yahoo! Pipes provides a graphical drag-and-drop development model for using syndicated data, and Google Mashup Editor provides a code environment particularly targeted at utilizing Google services and products; the target audience for both of them, though, tends to be developers.

    Popfly differs in its approach to democratize development by raising the level of abstraction and narrowing down options in block configurations. This greatly simplifies the process of piecing together building blocks, and it is this simplicity that offers Popfly's greatest advantage at making development social, and potentially more appealing to a wider audience.

    The public beta provides many kinds of building blocks - display, fun & games, images & video (media), local information, maps, news & RSS, shopping, social networks, tools (programming utilities such as RegExp), and others. These building blocks represent configurable components that map to many different kinds of cloud-based service APIs, such as Flickr, Facebook, Live Search, AOL Video Search, Yahoo! Videos, Virtual Earth, Yahoo! Traffic, Digg, Yahoo! News, Twitter, Technorati, etc.; the list goes on. The rich collection (and growing) of building blocks allows not just the mashup of functions and data, but also adding an interchangeable visualization and interaction layer to the applications.

    Popfly bootstrapped these cloud-based service APIs, and exposed their methods, input parameters, and results as configurable elements in each building block. In addition, Popfly also pre-defines and maintains compatible relationships between these APIs, so in many cases default configurations are sufficient for creating a mashup without requiring the user to perform any configuration changes. Simply dragging, dropping, and connecting the dots will do.

    Popfly itself is implemented using a combination of traditional Web application technologies (ASP.NET, AJAX, JavaScript, HTML, etc.) hosted in a highly available server infrastructure, and a Silverlight implementation of the in-browser development environment.

    The challenge for Popfly is reaching critical mass in adoption. The Facebook Platform, which is really a software distribution platform, draws its power from the lively communities in Facebook. Popfly can achieve similar goals if healthy growth in its adoption can be turned into a self-propelling virtuous cycle.

    Thus Popfly really is a platform in the Web 2.0 world. It provides an environment in which people without a significant technical background can build things, and hides the complexities of the underlying infrastructure. It also articulates many of the Web 2.0 principles, such as enabling participation and harnessing collective intelligence, leveraging the long tail, lightweight development models, rich user experiences, etc. For businesses and organizations looking to open up their data and services, or to interact with the user communities, participating in the Popfly ecosystem could be a simple way to enable viral adoption in the distribution channel (and for some, utilizing the monetization methods).

    A 1-Minute Mashup Application

    To illustrate Popfly's simple elegance, I created a mashup between a Flickr picture set and a visualization block that uses Silverlight. A snapshot of the application in edit mode is shown below.


    Without going into a detailed step-by-step replay, all I did was drag/drop the Flickr block, configure it with the Flickr set ID that contains the pictures I want to use, drag/drop the Carousel block, then drag/drop a connector from the Flickr block to the Carousel block. Hooking up the output from Flickr with the input parameters in Carousel was done automatically and seamlessly. That's it! And the application is now ready to be deployed across the Web.
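
    For readers who think in code, the block-and-connector idea can be approximated in a few lines of Python. This is only an illustrative sketch and not Popfly's API: the `Block` class, the `connect` helper, and the `fake_photo_set` and `carousel` functions are all hypothetical names, and the photo source returns canned data rather than calling Flickr.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Block:
    """Hypothetical building block: a named operation plus its configuration."""
    name: str
    operation: Callable[..., Any]
    config: Dict[str, Any] = field(default_factory=dict)


def connect(source: Block, *downstream: Block) -> Callable[[], Any]:
    """Chain blocks so that each block's output feeds the next block's input."""
    def mashup() -> Any:
        # The source block runs from configuration alone.
        result = source.operation(**source.config)
        # Every later block receives the previous output as its first argument.
        for block in downstream:
            result = block.operation(result, **block.config)
        return result
    return mashup


def fake_photo_set(set_id: str) -> List[Dict[str, str]]:
    # Stand-in for a photo service block; returns canned data, no network call.
    return [{"title": f"photo-{i}", "set": set_id} for i in range(3)]


def carousel(photos: List[Dict[str, str]], columns: int = 3) -> str:
    # Stand-in for a visualization block; renders a plain-text strip of titles.
    return " | ".join(photo["title"] for photo in photos[:columns])


# Configure the source block, leave the display block on its defaults, connect.
photos_block = Block("photos", fake_photo_set, {"set_id": "my-set"})
carousel_block = Block("carousel", carousel)
app = connect(photos_block, carousel_block)
print(app())
```

    The point of the sketch is the default wiring: the display block needs no configuration because its input shape matches the source block's output, which mirrors the pre-defined compatibility between blocks described above.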

    The resulting mashup application is embedded below. I picked a presentation block that uses Silverlight, but there are blocks that are pure HTML and JavaScript too.



    Building Highly Scalable Java Applications on Windows Azure (JavaOne 2010)


    JavaOne has always been one of my favorite technology conferences, and this year I had the privilege to present a session there. Given my background in Java, previous employment at Sun Microsystems, and the work I’m currently doing with Windows Azure at Microsoft, it’s only natural to try to piece them together and find more ways to use them. Well, honestly, this also gives me an excuse to attend the conference, plus the co-located Oracle OpenWorld, along with 41,000 other attendees. ;)

    A related article published on InfoQ may also provide some context around this presentation, as may my earlier post on getting Jetty to work in Azure -, which goes into a bit more technical detail on how a Java application can be deployed and run in Windows Azure.

    Java in Windows Azure

    So at the time of this writing, deploying and running Java in Windows Azure is conceptually analogous to launching a JVM and running a Java app from files stored on a USB flash drive (or files extracted from a zip/tar file without any installation procedures). This is primarily because Windows Azure isn’t a simple server/VM hosting environment. The Windows Azure cloud fabric provides a lot of automation and abstraction so that we don’t have to deal with server OS administration and management. For example, developers only have to upload application assets including code, data, content, policies, configuration files and service models, etc., while Windows Azure manages the underlying infrastructure:

    • application containers and services, distributed storage systems
    • service lifecycle, data replication and synchronization
    • server operating system, patching, monitoring, management
    • physical infrastructure, virtualization, networking
    • security
    • “fabric controller” (automated, distributed service management system)

    The benefit of this cloud fabric environment is that developers don’t have to spend time and effort managing the server infrastructure; they can focus on the application instead. However, the higher abstraction level also means we are interacting with sandboxes and containers, and there are constraints and limitations compared to the on-premise model where the server OS itself (or middleware and app server stack we install separately) is considered the platform. Some of these constraints and limitations include:

    • dynamic networking – requires interaction with the fabric to figure out the networking environment available to a running application. And as documented, at this moment, the NIO stack in Java is not supported because of its use of loopback addresses
    • no OS-level access – cannot install software packages
    • non-persistent local file system – have to persist files elsewhere, including log files and temporary and generated files

    These constraints impact Java applications because the JVM is a container itself and needs this higher level of control, whereas .NET apps can leverage the automation enabled in the container. The good news is that the Windows Azure team is working hard to deliver many enhancements to help with these issues and, interestingly, in both directions: adding more higher-level abstractions as well as providing more lower-level control.

    Architecting for High Scale

    So at some point we will be able to deploy full Java EE application servers and enable clustering and stateful architectures, but for really large-scale applications (at the level of Facebook and Twitter, for example), the current recommendation is to leverage shared-nothing and stateless architectures. This is largely because, in cloud environments like Azure, the vertical scaling ceiling for physical commodity servers is not very high, and adding more nodes to a cluster architecture means we don’t get to leverage the automated management capabilities built into the cloud fabric. There is also the need to design for system failures (service resiliency), as opposed to assuming a fully redundant hardware infrastructure as we typically do with large on-premise server environments.


    (Pictures courtesy of LEGO)

    The top-level recommendation for building a large-scale application in commodity server-based clouds is to apply more distributed computing best practices, because we’re operating in an environment with many smaller servers, as opposed to fewer bigger servers. The last part of my JavaOne presentation goes into some of those considerations. Basically: small pieces, loosely coupled. It’s not like traditional server-side development, where we’d try to get everything accomplished within the same process/memory space, per user request. Applications can scale much better if we defer (async) and/or parallelize as much work as possible, very similar to Twitter’s current architecture. So we could end up having many front-end Web roles that just receive HTTP requests, persist some data somewhere, fire off event(s) into the queue, and return a response. Then another layer of Worker roles can pick up the messages from the queue and do the rest of the work in an event-driven manner. This model works great in the cloud because we can scale the front-end Web roles independently of the back-end Worker roles, without having to worry about physical capacity.
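
    The Web role / Worker role split described above can be sketched in a few lines. The example below is written in Python for brevity and uses an in-process `queue.Queue` and a dictionary as stand-ins for a cloud queue and durable storage; it does not use any Windows Azure API, and all function names are hypothetical. The same shape carries over to Java or .NET.

```python
import queue
import threading
from typing import Dict, List


def web_role_handle(request_id: str, payload: str,
                    datastore: Dict[str, str], work_queue: queue.Queue) -> dict:
    """Front end: persist the input, enqueue an event, and respond right away."""
    datastore[request_id] = payload
    work_queue.put({"request_id": request_id})
    return {"status": 202, "request_id": request_id}  # accepted, work deferred


def worker_role_loop(datastore: Dict[str, str], work_queue: queue.Queue,
                     results: List[str], lock: threading.Lock) -> None:
    """Back end: pull events off the queue and do the slower work."""
    while True:
        message = work_queue.get()
        try:
            if message is None:  # sentinel: this worker should stop
                return
            payload = datastore[message["request_id"]]
            with lock:
                results.append(payload.upper())  # placeholder for real processing
        finally:
            work_queue.task_done()


def run_demo(request_count: int = 5, worker_count: int = 2) -> List[str]:
    datastore: Dict[str, str] = {}
    results: List[str] = []
    work_queue: queue.Queue = queue.Queue()
    lock = threading.Lock()
    workers = [
        threading.Thread(target=worker_role_loop,
                         args=(datastore, work_queue, results, lock))
        for _ in range(worker_count)
    ]
    for worker in workers:
        worker.start()
    for i in range(request_count):
        web_role_handle(f"req-{i}", f"item-{i}", datastore, work_queue)
    work_queue.join()            # block until every queued message is processed
    for _ in workers:
        work_queue.put(None)     # one stop sentinel per worker
    for worker in workers:
        worker.join()
    return sorted(results)


print(run_demo())
```

    Note that `worker_count` can be changed without touching the front-end function, which is the independent scaling property the paragraph above relies on.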


    In this model, applications need to be architected with these fundamental principles:

    • Small pieces, loosely coupled
    • Distributed computing best practices
      • asynchronous processes (event-driven design)
      • parallelization
      • idempotent operations (handle duplicates)
      • de-normalized, partitioned data (sharding)
      • shared nothing architecture
      • optimistic concurrency
      • fault-tolerance by redundancy and replication
      • etc.
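
    Two items in this list, idempotent operations and optimistic concurrency, are easier to see in code than in prose. The following Python sketch is one possible illustration under stated assumptions: `VersionedStore` is a hypothetical in-memory stand-in for a store that supports conditional writes, and `IdempotentCounter` is a hypothetical message handler. Neither corresponds to a specific Windows Azure API.

```python
from typing import Dict, Set, Tuple


class VersionConflict(Exception):
    """Raised when a write is based on a stale version of a record."""


class VersionedStore:
    """In-memory stand-in for a store that supports conditional writes."""

    def __init__(self) -> None:
        self._rows: Dict[str, Tuple[int, int]] = {}  # key -> (version, value)

    def read(self, key: str) -> Tuple[int, int]:
        return self._rows.get(key, (0, 0))

    def write_if_version(self, key: str, expected_version: int, value: int) -> None:
        current_version, _ = self.read(key)
        if current_version != expected_version:
            raise VersionConflict(key)
        self._rows[key] = (expected_version + 1, value)


class IdempotentCounter:
    """Applies each message at most once, even if it is delivered again."""

    def __init__(self, store: VersionedStore) -> None:
        self.store = store
        # Assumption: a real system would persist processed IDs durably,
        # ideally in the same commit as the write itself.
        self.seen: Set[str] = set()

    def handle(self, message_id: str, key: str, amount: int,
               max_retries: int = 3) -> bool:
        if message_id in self.seen:
            return False  # duplicate delivery, nothing to do
        for _ in range(max_retries):
            version, value = self.store.read(key)
            try:
                # Optimistic concurrency: no lock is held, the write only
                # succeeds if nobody changed the record since it was read.
                self.store.write_if_version(key, version, value + amount)
            except VersionConflict:
                continue  # another writer got there first, re-read and retry
            self.seen.add(message_id)
            return True
        raise VersionConflict(key)


store = VersionedStore()
counter = IdempotentCounter(store)
counter.handle("msg-1", "page-views", 1)
counter.handle("msg-1", "page-views", 1)  # redelivered duplicate is ignored
counter.handle("msg-2", "page-views", 1)
print(store.read("page-views"))
```

    Handling duplicates matters in the queue-based design because a message may be delivered more than once when a worker fails partway through its work.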

    Thus traditionally monolithic, sequential, and synchronous processes can be broken down into smaller, independent/autonomous, and loosely coupled components/services. As a result of the smaller footprint of processes and loosely coupled interactions, the overall architecture will see better system-level resource utilization (it is easier to handle many smaller, faster units of work), improved throughput and perceived response time, and superior resiliency and fault tolerance, leading to higher scalability and availability.

    Lastly, even though this conversation advocates a different way of architecting Java applications to support high scalability and availability, the same fundamental principles apply to .NET applications as well.
