Architecture + Strategy

Musings from David Chou - Architect, Microsoft

  • Architecture + Strategy

    Describing Web Platform Stack



    In an earlier blog post I talked about "Web as a Platform" (in Web 2.0's context) and briefly described a layered and componentized perspective in looking at the Web platform in general. And I thought it would be more clarifying to illustrate what a Web platform stack might look like, so this post is intended to describe (not define) a stack view of the Web platform.

    Plus, the evolution of this stack is the result of collective innovation contributed by many brilliant minds, and not driven by any single entity. Just as Eric Schmidt had said, "don't fight the Internet", I also don't think we need to model the Web after a specific prescribed framework. Rather, just allow the collective consciousness continue to innovate organically.

    Thus this is a description of the Web platform (not a definition), as it is merely an attempt at categorizing the observed patterns and trends, and their relationships and dependencies, in the Web 2.0 phenomenon, into a structured context. There are many ways to describe and categorize these patterns, so this view is not necessarily exact and accurate, but hopefully it can provide some clarifications into the way Web is evolving.

    Architecture of the Web

    Below is a high-level rendering of the layered components architecture view of the Web platform stack:

    The choice of words used is questionable, but the intention here is to highlight the trends and patterns (and their relationships) and hoping to effectively convey the concepts, without spending the time to make sure they are semantically accurate.

    Just as I mentioned in my earlier blog post, layers towards the bottom of the stack are progressively closer to raw data and IT architectures, and layers towards the top are closer to people. In general, this layered stack view is used as I think lower layers serve as platforms that encapsulate the underlying complexities and provide abstraction and support to the upper layers. Even though this also kind of describes the evolutionary path (or a maturity model) of the Web in the past few years, I think this stack view is relevant as innovation is still occurring across this entire view.

    Some examples to help clarify (nowhere near a comprehensive list; just intended to illustrate the categorizations):


    • Standards - XML, HTML, CSS, SOAP, REST, Atom, RSS, BitTorrent, HTTP, SMTP, FTP, SMS, VoIP, etc.
    • Tools - LAMP, WISA, JavaScript, .NET, Java, Visual Studio, Eclipse, etc.
    • Media - video streaming, podcasts, vcasts, electronic gaming, interactive TV, Microsoft IP TV, Microsoft Media Center
    • Runtimes - hosting environment, servers, desktops, browsers, clients, mobile devices, Microsoft Xbox, Sony Playstation, Nintendo Wii, Adobe AIR, Microsoft Silverlight, etc.
    • Networks - Internet, Wi-Fi, VPN, WAN, cellular, wireless LAN, DSL, FiOS, etc.


    • Utilities - Amazon EC2, programmableweb, etc.
    • Data - Amazon S3, Google Base, Microsoft Astoria, etc.
    • Storage - Google GDrive, Windows Live Skydrive, XDrive, DriveHQ,, Elephant Drive, etc.
    • Messaging - Amazon SQS, Microsoft BizTalk Services, etc.
    • Identity - Windows Live ID, Google Accounts, Yahoo! Accounts, OpenID, etc.


    • Personalization - My Yahoo!, iGoogle, Netvibes, Windows Live, bookmarks, favorites, etc.
    • Transformation - Microsoft BizTalk Services (part of Don Ferguson's description of the "Internet Service Bus")
    • Composition - Yahoo! Pipes, Google Mashup Editor, Microsoft BizTalk Services, etc.
    • Orchestration - Microsoft BizTalk Services (part of Don Ferguson's description of the "Internet Service Bus")
    • Privacy - TBD; in general, interoperable services to give users control over what parts of their online presences to share and what not to share


    • Information - Google Analytics, Google Trends, MSN, Yahoo! News, Yahoo! Finance, Upcoming, etc.
    • Visualization - Google Maps, Virtual Earth, Yahoo! Maps, Google Gadgets, Windows Live Gadgets, Vista Sidebar Gadgets, mobile clients, etc.
    • Commerce - Amazon, eBay, Paypal, Google Checkout, MSN Shopping, Microsoft Points, etc.
    • Monetization - Google AdSense, Google AdWords, Microsoft AdCenter, pay-per-click, cost-per-action, impressions, etc.
    • Accessibility - TellMe, Google Translate, Live Search Translator, services for the visually impaired like Google Accessible Search, plusmo, ZapText, etc.


    • Search - Google Search, Yahoo! Search, Ask, Windows Live Search, etc.
    • Distribution - Facebook Platform, Microsoft Popfly, etc.
    • Aggregation - Newsgator, Bloglines, Rojo, NetNewsWire, My Yahoo!, Windows Live, iGoogle, PageFlakes, etc.
    • Syndication - Twitter, Jaiku, Pownce, Facebook Newsfeed, Feedburner, Technorati, etc.
    • Portability - Gadgets, Widgets, Google OpenSocial, etc.


    • User Content - blogs, wikis, reviews, photo sharing, Blogger, WordPress, LiveJournals, Wikipedia, CrowdRules, Flickr, Youtube. Epinions, Urban Dictionary, Trip Advisor, eHarmony, Match, etc.
    • Communities - MySpace, Facebook, Orkut, hi5, Bebo, Windows Live Spaces, Friendster, LinkedIn, World of Warcraft, Xbox Live, Second Life, etc.
    • Folksonomies -, Digg, reddit, Simpy, Furl, Netvouz, etc.
    • Collaborative Filtering - Amazon,, NetFlix, TiVO,, StumbleUpon, etc.
    • Mashups - Microsoft Popfly, JackBe, etc.

    Interaction (emerging):

    • Social Graphs - capability to enable social network analysis and moving towards mapping physical relationships
    • Collective Intelligence - capability to comprehend and extract meaning from composite/aggregate communities
    • Microformats - creation of various metadata formatting approaches to add contextual relationships
    • Semantic Relationships - adding meaning and contextual mapping to various forms of content available on the Web
    • Implicit Networks - frameworks for capturing and analyzing dynamic aspects of user activities on the Web

    Interpretation (futures):

    • Derived Intelligence - a form of artificial intelligence derived from collective intelligence to aid predictive analysis
    • User Intent - discernment of user intentions based on historical user activities, responses, and collective trends and patterns
    • Dynamic Relationships - ability to map dynamic aspects between user activities throughout the Web

    In general, each component in each layer is worthy of a separate detailed analysis, and only a minor fraction of things that are examples of a particular category have been listed. Though the intention is to show that each site or service individually is not representative of the layer component; it is the network effects created by the collection of sites and services in that category. Similarly for the layers in the platform stack, it is the aggregation of individual components that really exemplify the characteristics of that layer.

    As a result, we can see that each layer in this stack view has dependencies on the services offered in the underlying layers. And each layer itself provides a level of abstraction and support to the layers above. Thus architecting solutions using the Web as a platform is quickly becoming a process of choosing a target layer (where the solution will reside), and choosing the appropriate combination of support services from the underlying layers.

    This view can also be helpful in visualizing general trends of innovative development on the Web, and identify potential spots where opportunistic developments can occur, and areas where they have been turned over to systematic developments. For example, the current state (as of this writing) is that mainstream efforts can be categorized as focused in the "Participation" layer, and is where many of the opportunistic developments are being turned into systematic ones (basically, gaining maturity). The "Interaction" and "Interpretation" layers are considered to be the subject of the next Web (or Web 3.0), and is where much of the research & development efforts are focused in.

    One fundamental aspect is that, the Web platform is created and maintained by the collective wisdom contributing to it. It is too big and too diverse for any one entity to own, even though many organizations have investments in multiple areas, and some more than the others. But it is interesting to see how this view of the Web platform is taking shape, based on the inter-dependencies and (almost "symbiotic") relationships established between the clusters of sites and organizations operating on the Web.


    The general concept here is that, Web 2.0 applications are taking on a new form. They are composite applications in nature, and increasingly can be created and hosted completely in the Web (cloud), without any dedicated on-premise infrastructure. And they are increasingly being implemented at higher levels of abstraction (moving up the stack).

    For established enterprises, this marks a shift in Web application models. From approaches to open up enterprise data silos and providing value-added services to customers ("Applications" layer aspects), to migrate to a model where various higher-level components of the Web ("Integration" and "Participation" layer aspects) can be integrated and leveraged to connect to the communities. For emerging businesses, it is now possible to quickly establish an initial online presence by completely building on the cloud-based Web platform, while looking to add differentiating values with a variety of options (such as dedicated on-premise solutions).

    From a user participation perspective, lower layers are progressively closer to people with higher technical expertise, but are populated by smaller communities. On the other hand, upper layers are progressively closer to larger communities as barriers to entry, from a technology perspective, are increasingly lower. This aspect demonstrates the power of network effects in enabling the participation age, and fueling the explosive pace of innovation towards creating a Web that connects/involves more people and is more relevant and intelligent.

    Certainly, areas where boundaries are being pushed may still sound like science fiction, and it's fun to imagine that new breakthroughs will bring about sea changes that will overthrow all conventional wisdom. The blogosphere already has tons of speculations in that respect. Though I believe "could" does not equate to "should", such that change for change sake will not add value; only changes that lead to better outcomes will gain adoption. Thus my assessment is that, significant changes are surely imminent, but conventional wisdom will also not cease to complete irrelevance. Eventually, when the pendulum settles, we usually see a hybrid world, with some changes more dominant, and some changes less. The Web is a place where rapid changes are occurring, and as architects and strategists, using a pragmatic viewpoint when facing these changes may help us better plan the migration path between current and future states.

    Share this post :

    This post is part of a series:

  • Architecture + Strategy

    Designing for Cloud-Optimized Architecture


    I wanted to take the opportunity and talk about the cloud-optimized architecture, the implementation model instead of the popular perceptions around leveraging cloud computing as a deployment model. This is because, while cloud platforms like Windows Azure can run a variety of workloads, including many legacy/existing on-premises software and application migration scenarios that can run on Windows Server; I think Windows Azure’s platform-as-a-service model offers a few additional distinct technical advantages when we design an architecture that is optimized (or targeted) for the cloud platform.

    Cloud platforms differ from hosting providers

    First off, the major cloud platforms (regardless how we classify them as IaaS or PaaS) at the time of this writing, impose certain limitations or constraints in the environment, which makes them different from existing on-premises server environments (saving the public/private cloud debate to another time), and different from outsourced hosting managed service providers. Just to cite a few (according to my own understanding, at the time of this writing):

    • Amazon Web Services
      • EC2 instances are inherently stateless; that is, their local storage is non-persistent and non-durable
      • Little or no control over infrastructure that are used beneath the EC2 instances (of course, the benefit is we don’t have to be concerned with them)
      • Requires systems administrators to configure and maintain OS environments for applications
    • Google App Engine
      • Non-VM/OS instance-aware platform abstraction which further simplifies code deployment and scale, though some technical constraints (or requirements) as well. For example,
      • Stateless application model
      • Requires data de-normalization (although Hosted SQL will mitigate some concerns in this area)
      • If the application can't load into memory within 1 second, it might not load and return 500 error codes (from Carlos Ble)
      • No request can take more than 30 seconds to run, otherwise it is stopped
      • Read-only file system access
    • Windows Azure Platform
      • Windows Azure instances are also inherently stateless – round-robin load balancer and non-persistent local storage
      • Also due to the need to abstract infrastructure complexities, little or no control for the underlying infrastructure is offered to applications
      • SQL Azure has individual DB sizing constraints due to its 3-replica synchronization architecture

    Again, just based on my understanding, and really not trying to paint a “who’s better or worse” comparative perspective. The point is, these so-called “differences” exist because of many architectural and technical decisions and trade-offs to provide the abstractions from the underlying infrastructure. For example, the list above is representative of most common infrastructure approaches of using homogeneous, commodity hardware, and achieve performance through scale-out of the cloud environment (there’s another camp of vendors that are advocating big-machine and scale-up architectures that are more similar to existing on-premises workloads). Also, the list above may seem unfair to Google App Engine, but on the flip side of those constraints, App Engine is an environment that forces us to adopt distributed computing best practices, develop more efficient applications, have them operate in a highly abstracted cloud and can benefit from automatic scalability, without having to be concerned at all with the underlying infrastructure. Most importantly, the intention is to highlight that there are a few common themes across the list above – stateless application model, abstraction from infrastructure, etc.

    Furthermore, if we take a cloud computing perspective, instead of trying to apply the traditional on-premises architecture principles, then these are not really “limitations”, but more like “requirements” for the new cloud computing development paradigm. That is, if we approach cloud computing not from a how to run or deploy a 3rd party/open-source/packaged or custom-written software perspective, but from a how to develop against the cloud platform perspective, then we may find more feasible and effective uses of cloud platforms than traditional software migration scenarios.

    Windows Azure as an “application platform”

    Fundamentally, this is about looking at Windows Azure as a cloud platform in its entirety; not just a hosting environment for Windows Server workloads (which works too, but the focus of this article is on cloud-optimized architecture side of things). In fact, Windows Azure got its name because it is something a little different than Windows Server (at the time of this writing). And that technically, even though the Windows Azure guest VM OS is still Windows Server 2008 R2 Enterprise today, the application environment isn’t exactly the same as having your own Windows Server instances (even with the new VM Role option). And it is more about leveraging the entire Windows Azure platform, as opposed to building solely on top of the Windows Server platform.

    For example, below is my own interpretation of the platform capabilities baked into Windows Azure platform, which includes SQL Azure and Windows Azure AppFabric also as first-class citizens of the Windows Azure platform; not just Windows Azure.


    I prefer using this view because I think there is value to looking at Windows Azure platform holistically. And instead of thinking first about its compute (or hosting) capabilities in Windows Azure (where most people tend to focus on), it’s actually more effective/feasible to think first from a data and storage perspective. As ultimately, code and applications mostly follow data and storage.

    For one thing, the data and storage features in Windows Azure platform are also a little different from having our own on-premises SQL Server or file storage systems (whether distributed or local to Windows Server file systems). The Windows Azure Storage services (Table, Blob, Queue, Drive, CDN, etc.) are highly distributed applications themselves that provide a near-infinitely-scalable storage that works transparently across an entire data center. Applications just use the storage services, without needing to worry about their technical implementation and up-keeping. For example, for traditional outsourced hosting providers that don’t yet have their own distributed application storage systems, we’d still have to figure out how to implement and deploy a highly scalable and reliable storage system when deploying our software. But of course, the Windows Azure Storage services require us to use new programming interfaces and models (REST-based API’s primarily), and thus the difference with existing on-premises Windows Server environments.

    SQL Azure, similarly, is not just a plethora of hosted SQL Server instances dedicated to customers/applications. SQL Azure is actually a multi-tenant environment where each SQL Server instance can be shared among multiple databases/clients, and for reliability and data integrity purposes, each database has 3 replicas on different nodes and has an intricate data replication strategy implemented. The Inside SQL Azure article is a very interesting read for anyone who wants to dig into more details in this area.

    Besides, in most cases, a piece of software that runs in the cloud needs to interact with data (SQL or no-SQL) and/or storage in some manner. And because data and storage options in Windows Azure platform are a little different than their seeming counterparts in on-premises architectures, applications often require some changes as well (in addition to the differences in Windows Azure alone). However, if we look at these differences simply as requirements (what we have) in the cloud environment, instead of constraints/limits (what we don’t have) compared to on-premises environments, then it will take us down the path to build cloud-optimized applications, even though it might rule out a few application scenarios as well. And the benefit is that, by leveraging the platform components as they are, we don’t have to invest in the engineering efforts to architect and build and deploy highly reliable and scalable data management and storage systems (e.g., build and maintain your own implementations of Cassandra, MongoDB, CouchDB, MySQL, memcarche, etc.) to support applications; we can just use them as native services in the platform.

    The platform approach allows us to focus our efforts on designing and developing the application to meet business requirements and improve user experience, by abstracting away the technical infrastructure for data and storage services (and many other interesting ones in AppFabric such as Service Bus and Access Control), and system-level administration and management requirements. Plus, this approach aligns better with the primary benefits of cloud computing – agility and simplified development (less cost as a result).

    Smaller pieces, loosely coupled

    Building for the cloud platform means designing for cloud-optimized architectures. And because the cloud platforms are a little different from traditional on-premises server platforms, this results in a new developmental paradigm. I previously touched on this topic with my presentation at JavaOne 2010, then later on at Cloud Computing Expo 2010 Santa Clara; just adding some more thoughts here. To clarify, this approach is more relevant to the current class of “public cloud” platform providers such as ones identified earlier in this article, as they all employ the use of heterogeneous and commodity servers, and with one of the goals being to greatly simplify and automate deployment, scaling, and management tasks.

    Fundamentally, cloud-optimized architecture is one that favors smaller and loosely coupled components in a highly distributed systems environment, more than the traditional monolithic, accomplish-more-within-the-same-memory-or-process-or-transaction-space application approach. This is not just because, from a cost perspective, running 1000 hours worth of processing in one VM is relatively the same as running one hour each in 1000 VM’s in cloud platforms (although the cost differential is far greater between 1 server and 1000 servers in an on-premises environment). But also, with a similar cost, that one unit of work can be accomplished in approximately one hour (in parallel), as opposed to ~1000 hours (sequentially). In addition, the resulting “smaller pieces, loosely coupled” architecture can scale more effectively and seamlessly than a traditional scale-up architecture (and usually costs less too). Thus, there are some distinct benefits we can gain, by architecting a solution for the cloud (lots of small units of work running on thousands of servers), as opposed to trying to do the same thing we do in on-premises environments (fewer larger transactions running on a few large servers in HA configurations).

    I like using the LEGO analogy below. From this perspective, the “small pieces, loosely coupled” fundamental design principle is sort of like building LEGO sets. To build bigger sets (from a scaling perspective), with LEGO we’d simply use more of the same pieces, as opposed to trying to use bigger pieces. And of course, the same pieces can allow us to scale down the solution as well (and not having to glue LEGO pieces together means they’re loosely coupled). Winking smile

    But this architecture also has some distinct impacts to the way we develop applications. For example, a set of distributed computing best practices emerge:

    • asynchronous processes (event-driven design)
    • parallelization
    • idempotent operations (handle duplicity)
    • de-normalized, partitioned data (sharding)
    • shared nothing architecture
    • fault-tolerance by redundancy and replication
    • etc.

    Asynchronous, event-driven design – This approach advocates off-loading as much work from user requests as possible. For example, many applications just simply incur the work to validate/store the incoming data and record it as an occurrence of an event and return immediately. In essence it’s about divvying up the work that makes up one unit of work in a traditional monolithic architecture, as much as possible, so that each component only accomplishes what is minimally and logically required. Rest of the end-to-end business tasks and processes can then be off-loaded to other threads, which in cloud platforms, can be distributed processes that run on other servers. This results in a more even distribution of load and better utilization of system resources (plus improved perceived performance from a user’s perspective), thus enabling simpler scale-out scenarios as additional processing nodes and instances can be simply added to (or removed from) the overall architecture without any complicated management overhead. This is nothing new, of course; many applications that leverage Web-oriented architectures (WOA), such as Facebook, Twitter, etc., have applied this pattern for a long time in practice. Lastly, of course, this also aligns well to the common stateless “requirement” in the current class of cloud platforms.

    Parallelization – Once the architecture is running in smaller and loosely coupled pieces, we can leverage parallelization of processes to further improve the performance and throughput of the resulting system architecture. Again, this wasn’t so prevalent in traditional on-premises environments because creating 1000 additional threads on the same physical server doesn’t get us that much more performance boost when it is already bearing a lot of traffic (even on really big machines). But in cloud platforms, this can mean running the processes in 1000 additional servers, and for some processes this would result in very significant differences. Google’s Web search infrastructure is a great example of this pattern; it is publicized that each search query gets parallelized to the degree of ~500 distributed processes, then the individual results get pieced together by the search rank algorithms and presented to the user. But of course, this also aligns to the de-normalized data “requirement” in the current class of cloud platforms, as well as SQL Azure’s implementation that resulted in some sizing constraints and the consequent best practice of partitioning databases, because parallelized processes can map to database shards and try not to significantly increase the concurrency levels on individual databases, which can still degrade overall performance.

    Idempotent operations – Now that we can run in a distributed but stateless environment, we need to make sure that same process that gets routed to multiple servers don’t result in multiple logical transactions or business state changes. There are processes that could and prefer duplicate transactions, such as ad clicks; but there are also processes that don’t want multiple requests be handled as duplicates. But the stateless (and round-robin load-balancing in Windows Azure) nature of cloud platforms requires us to put more thoughts into scenarios such as when a user manages to send multiple submits from a shopping cart, as these requests would get routed to different servers (as opposed to stateful architectures where they’d get routed back to the same server with sticky sessions) and each server wouldn’t know about the existence of the process on the other server(s). There is no easy way around this, as the application ultimately needs to know how to handle conflicts due to concurrency. Most common approach is to implement some sort of transaction ID that uniquely identifies the unit of work (as opposed to simply relying on user context), then choose between last-writer or first-writer wins, or optimistic locking (though any form of locking would start to reduce the effectiveness of the overall architecture).

    De-normalized, partitioned data (sharding) – Many people perceive the sizing constraints in SQL Azure (currently at 50GB – also note it’s the DB size and not the actual file size which may contain other related content) as a major limitation in Windows Azure platform. However, if a project’s data can be de-normalized to a certain degree, and partitioned/sharded out, then it may fit well into SQL Azure and benefit from the simplicity, scalability, and reliability of the service. The resulting “smaller” databases actually can promote the use of parallelized processes, perform better (load more distributed than centralized), and improve overall reliability of the architecture (one DB failing is only a part of the overall architecture, for example).

    Shared nothing architecture – This means a distributed computing architecture in which each node is independent and self-sufficient, and there is no single point of contention across the system. With data sharding and maintained in many distributed nodes, the application itself can and should be developed using shared-nothing principles. But of course, many applications need access to shared resources. It is then a matter of deciding whether a particular resource needs to be shared for read or write access, and different strategies can be implemented on top of a shared nothing architecture to facilitate them, but mostly as exceptions to the overall architecture.

    Fault-tolerance by redundancy and replication – This is also “design for failures” as referred to many cloud computing experts. Because of the use of commodity servers in these cloud platform environments, system failures are a common thing (hardware failures occur almost constantly in massive data centers) and we need to make sure we design the application to withstand system failures. Similar to thoughts around idempotency above, designing for failures basically means allowing requests to be processed again; “try-again” essentially.

    Lastly, each of the topic areas above is worthy of an individual article and detailed analysis; and lots of content are available on the Web that provide a lot more insight. The point here is, each of the principles above actually has some relationship with, and dependency on, the others. It is the combination of these principles that contribute to an effective distributed computing, and cloud-optimized architecture.

  • Architecture + Strategy

    Rise of the Cloud Ecosystems


    I had the opportunity to participate at this year’s CloudConnect conference, with my session on Building Highly Scalable Applications on Windows Azure, which is mostly an update from my earlier presentations at JavaOne and Cloud Computing Expo. I was pleased to learn that the cloud-optimized design leveraging distributed computing best practices approach, aligned well to similar talks by well-known cloud experts from Amazon, Google, etc. A more detailed discussion on this topic can be found in my earlier post - Designing for Cloud-Optimized Architecture.

    One of the major takeaways I had from the conference, was the focus on ‘cloud platforms’ (or platform-as-a-service generally) messaging, further reinforcing the platform view of cloud computing, as opposed to the infrastructure-level perspectives, or mixed views around the popular SaaS, PaaS, and IaaS service delivery models.

    Werner Vogels,  Vice President and CTO, Amazon.comIt started with Werner Vogels in his keynote presentation. Werner said, “it is all about the cloud ecosystem”, that “everything are cloud services; everything as a cloud service”, and “not constrained by any model”. And “let a thousand platforms bloom”, where “ecosystems grow as big as possible”. This implies that the popular models (e.g., SaaS, PaaS, IaaS) are irrelevant, because everything are simply services that we can consume, and these services can span the entire spectrum of IT capabilities as we know, and potentially more.

    Infrastructure as a serviceIt is interesting to see how the platform messaging evolved over the past few years. For instance, Werner Vogels used to refer to Amazon Web Services as “Infrastructure as a Service”. However, I think Werner Vogels has always advocated the platform view, similar to Gartner’s notion of “application infrastructure as a service”, instead of the overused IaaS (infrastructure-as-a-service) service delivery model (as popularized by NIST’s definition of cloud computing). Perhaps, it was also because many people started incorrectly referring to Amazon Web Services as IaaS and not seeing the platform view, that Werner chose to clarify that models are irrelevant and “it is all about the cloud ecosystem”.

    I also belong to the camp that advocates the platform view, and the further ecosystem view of cloud computing. The platform and ecosystem views of cloud computing represent a new paradigm, and promote a new way of computing. Though I think the SaaS, PaaS, and IaaS classifications (or service delivery models) still have some uses too. They are particularly relevant when trying to understand the general differences and trade-offs between the service delivery models (as defined by NIST), from a layers and levels of abstractions perspective.


    Perhaps, what we shouldn’t do, is to try to fit cloud service providers into these categories/models. As often, a particular service provider may have offerings in multiple models, have offerings that don’t fit well in these models, or it’d be over-simplifying to refer to cloud platforms, like Amazon Web Services, Google App Engine, Windows Azure platform, etc., strictly in this platform-as-a-service definition. These platforms have a lot more to offer than simply a higher-level abstraction compute service.

    Cloud Platforms

    As Werner Vogels said, “cloud is actually a very large collection of services”, cloud platforms aren’t just a place to deploy and execute workloads. Cloud platforms provide the necessary capabilities, delivered as distinct services, that applications can leverage to accomplish specific tasks.

    Amazon Web Services has always been a cloud platform; today it is a collection of services that provide capabilities for compute (EC2, EMR), content delivery (CloudFront), database (SimpleDB, RDS), deployment and management (Elastic Beanstalk, CloudFormation), e-commerce (FWS), messaging (SQS, SNS, SES), monitoring (CloudWatch), networking (VPC, ELB), payments and billing (FPS, DevPay), storage (S3, EBS), support, etc. It is not just a hosting environment for virtual machines (which the popular IaaS model is more aligned with). In fact Amazon Web Services released S3 into production (March 2006) before EC2 (limited public beta in August 2006, removed beta label in October 2008).

    Similarly, Microsoft has been using the platform-as-a-service messaging when describing Windows Azure platform, but it is also about the collection of various capabilities (delivered as services). For example, below is a visual representation of the application platform capabilities in Windows Azure platform that I have been using since 2009 (though the list of capabilities grew over that period):


    And below shows how those capabilities are delivered as services and organized in Windows Azure platform.


    This is important because the platform view is one of the major differentiators of cloud platforms when compared to the more conventional outsourced hosting providers. When building applications using hosting providers (or strictly infrastructure-as-a-service offerings), we have to incur the engineering efforts to design, implement, and maintain our own solutions for storage, data management, security, caching, etc. In cloud platforms such as Amazon Web Services, Google App Engine, and Windows Azure platform, these capabilities are baked into the platform and available as services that are readily accessible. Applications just need to use them, without having to be concerned with any of their underpinnings and operations.

    An analogy can be drawn from high-level differences between getting food products from Costco to put together a semi-homemade meal, versus getting raw ingredients from a supermarket and prepare, cook, and finish a fully-homemade meal. :) The semi-homemade model offers higher agility (less time and efforts required to put together a meal and to scale it for bigger parties) and economy of scale in Costco’s case, while the fully-homemade model offers more control.

    Another distinction between cloud platforms and typical IaaS offerings, is that cloud platforms are more of a way of computing – a new/different paradigm; whereas IaaS offerings are better aligned towards hosting scenarios. This doesn’t mean that there are no overlaps between the two models, or that one is necessarily better than the other. They are just different models; with some overlaps, but ideally suited for different use cases. For cloud platforms, ideal use cases are aligned to net-new, or greenfield development projects that are cloud-optimized. Again, hosting scenarios also work on cloud platforms, but cloud-optimized applications stand to gain more benefits from cloud platforms.

    Cloud Ecosystems

    The cloud ecosystem view takes the cloud platform view one step further, and includes partners and third parties that enable their services to participate in an ecosystem. The collective set of capabilities from multiple organizations and potentially services spanning multiple platforms and cloud environments together form an ecosystem that feeds and builds upon each other (in composite, federated, application models), and generating best practices and reusable processes, communities, etc. This can also be viewed as a natural evolution of platform paradigms, when drawing inference from other models where the iterations typically evolved from technology maturity, critical mass in adoption, and then building ecosystems. The platform with the largest and most diverse ecosystem, gets to ride the paradigm shift and enjoy a dominant position for that particular generation.

    The Web platform stack model I discussed back in 2007 is one way of looking at the ecosystem model (apologies for the rich color scheme; I was going through a coloring phase at the time).

    In essence, a cloud ecosystem itself will likely have many layers of abstractions as well; one building on top of another. One future trend in cloud computing may very well be the continued climb into higher levels of abstraction, as differences and complexities at one level often represent development opportunities (e.g., for specializations, consolidations/aggregations, derivations, etc.) at a higher level.

    Ultimately, cloud platforms enable the dynamic environments that support the construction of ecosystems. This is one aspect inherent in cloud platforms, but not as much for lower-level IaaS environments. And as the ecosystems grow in size and diversity, the network effect (as discussed briefly back in 2008) will contribute to increasingly intelligent and interactive environments, and generate, collectively, tremendous value.

  • Architecture + Strategy

    SOA Security - Enterprise Architecture Perspective


    This week I had the opportunity to speak at the IT Architect Regional Conference in San Diego, on the subject of architecting enterprise SOA security. It is an interesting event, with speakers from Microsoft, IBM, Oracle, TIBCO, Fair Issac, and many other organizations. We even gave away a brand new XBox 360 and a Zune!

    In a nutshell, my presentation was intended to point out the security aspects of planning an enterprise SOA, and a few topics that don't seem to be covered very often, and with an emphasis towards the future and navigating the organizational and cultural issues.

    A brief overview -


    Basically, some of the fundamental changes in SOA, such as:

    • Moving from low-volume batch-oriented data replication architectures to highly interactive real-time architectures between connected systems
    • Plus the migration towards Event-Driven Architectures (EDA) means an exponential growth in real-time (though asynchronous) communication, as each event can potentially trigger off a number of downstream events which can trigger off more events being sent across the network
    • All this moves the security concerns from the traditionally isolated infrastructure and application groups, into the integration layer that becomes a cross-cutting concern for everyone involved
    • SOA can also magnify existing issues such as identity management (or the lack of), and create new issues such as exposing mainframes directly to web traffic (for sake of real-time access into legacy applications and data)
    • The ideal state of "everything talking to everything in real-time" also means a breakdown of traditional physical network zones/perimeters, where DMZ becomes more like a reception/lobby area instead of a quarantine area, and data centers can no longer be considered locked down
    • Lastly, the threat environment has also evolved from single PC attacks, to DoS system attacks, and to today's application and data-level attacks, with lowered complexity and lowered barrier of entry (facilitated by vastly improved competencies in using XML)

    Then of course, these changes also bring along many questions. Particularly many that represent conflicting approaches and each organization may come up with different solutions based on varying trade-offs.


    For example,

    • Trust vs. impersonation/delegation. There are many security groups that believe enterprise network environments are inherently unsafe (which is agreeable), and thus all systems will need to require end-user authentication (regardless whether they are user-facing or intermediaries or downstream producer systems), and that "trust" cannot be trusted
    • From a different perspective, this debate is also centered on the concept of implementing end-to-end vs. peer-to-peer security contexts
    • There is also a lot of recent discussion on moving security intelligence (w/ centralized management) into the endpoints (laptops, mobile devices, etc.), or moving intelligence into the network (like recent advances in NAC)

    In my opinion, trust-based architectures are much more flexible and scalable, and implementable by today's technology standards. And we couldn't completely eliminate trust in an impersonation/delegation model anyway. For example, a connected node/system has to "trust"  service wrappers, agents, and/or local system components to verify user credentials against a centralized repository (such as Active Directory, LDAP, etc.) anyway.

    On the other hand, having end-to-end security contexts is indeed conceptually more secure, as it can help better address the man-in-the-middle attacks, but in an SOA with a number of intermediaries between consumers and producers, there is still not an effective solution in managing public keys to support end-to-end message-level data encryption.


    It's always interesting to try to take a peek at what may be possible in the future.

    • Most SOA discussions still seem to be focused on implementing "SOA in the enterprise". While that is very important, as enterprise architects we should also start to look at the growing trend of becoming more open on the Web, to an environment where enterprises essentially have no physical perimeters and security zones, largely due to the increasing number of direct and real-time connections into an enterprise (for sake of facilitating transactions with business partners).
    • Plus at that time we would also need to be concerned with the connections going from inside an enterprise out to the Web, as more and more internal systems becoming service consumers themselves
    • Thus a potential trend is moving away from trying to secure one large environment for the entire enterprise, migrating to a model where numerous (and potentially overlapping) smaller logical partitions (or zones) can be implemented to be provisioned with more targeted and effective security solutions (depending on data sensitivity). Rationale behind this is that it'll be more effective to try to protect smaller attack surfaces, even from a systems architecture perspective
    • Another interesting trend already underway is the growing centralization of data and content. Instead of consolidating everything into one or a few large enterprise content management deployments, organizations are creating smaller islands of data and content using collaboration platforms such as SharePoint. The point here is moving from mass distribution of data and content, and smaller islands seem to be lower hanging fruits at this point


    Finally, some overall talking points. One important and interesting point that was kind of new to many people is that security in SOA has to be planned and designed just like another process layer. If we overlook security and not plan it carefully, we may end up creating tightly coupled elements in the overall architecture, and impacting the agility we intended to create.

    The most visible example of this is trying to implement message-level encryption for the sake of data integrity (message digests) and confidentiality. In order to establish an end-to-end security context (so that intermediaries, including the ESB, should not be able to decrypt sensitive data on transit to the destination), both the intended consumer and producer have to know exactly how to encrypt and decrypt data. And that depends on a previous exchange of public keys, which in this case had to occur directly between the consumer and producer endpoints. That in a way is tight coupling, as the consumer and producer endpoints have to know about each other, and are required to establish a one-to-one, peer-to-peer relationship in terms of public keys exchange used for encryption/decryption. To alleviate the situation, a centralized public key infrastructure can be implemented in an enterprise so that the management and decisions on public key usage can be externalized from endpoints and centralized. However, enterprise solutions in this area are still evolving, and we haven't yet seen effective solutions for doing similar things beyond the enterprise and on the Web.

    Lastly, the most important point is that, just like SOA governance, security is also a huge factor of the organization and corporate culture. We have to take a process-first approach to the problem (instead of technology-first), then weave in the technology delivery part of it.

    For those interested, the entire slide deck I used can be downloaded from my Windows Live SkyDrive. If you don't have Office 2007, you can download the free PowerPoint Viewer 2007.

    Share this post :
  • Architecture + Strategy

    Embedded Database Engine for Silverlight Applications


    So far, rich Internet applications (RIA) don’t have to be concerned with data management on the client side, as most connected implementations are designed to be simply a visualization and interaction layer to data and services on the server side. In these cases, data is retrieved from the server as JSON or XML data types (or serialized binary objects for some platforms), then cached in-memory as objects and collections, part of the application, and eventually discarded as part of the application lifecycle.

    But for relatively more complex applications that require larger and more distributed data sets, lower application latency, reduced chattiness, data relationships (even basic ones), the implementation starts to get a little complicated, and becomes more difficult with procedural languages like JavaScript. In that case, a simple and light-weight in-memory embedded database may become quite useful.

    Perst for .NET

    McObjectMcObject’s Perst is an open source, object-oriented embedded database system. Perst has been available in Java (SE, ME, and EE) and in .NET and .NET Compact Framework, and Mono. But recently McObjects ported Perst to Silverlight as well. Its small library size (~1MB including support for generics; ~5,000 lines of code) and small in-memory footprint is packed with a very sophisticated set of features, such as:

    • Support for direct and transparent object persistence
    • Full text search with indexing support
    • Full-ACID transactions support
    • SQL-based interface
    • XML import/export

    Perst inherited much of its heritage from McObject’s eXtremeDB embedded database product, which has been used in devices such as MP3 players, WiMAX base stations, digital TV set top boxes, military and aerospace applications, etc. Yes, even embedded applications and firmware can use databases.

    Interestingly, the C# code tree was initially produced from the Java version using the Java-to-C# converted in Visual Studio, then some additional changes to enable specific C# features. While not all Java applications can be converted this easily, this shows that some can. And this also means how similar C# and Java are.

    As an embedded database engine, Perst presents a very viable in-memory data management solution to Silverlight applications. It can be instantiated and used directly as part of the application, instead of something that runs in a separate process and memory space, or over the network on the back-end server for most RIA’s. Perst can also use Silverlight’s isolated storage to manage additional data that can be cached in a persistent manner. This enables many more complex scenarios than simply storing JSON, XML, CSV, or plain text data in isolated storage.

    Silverlight 3 Out-Of-Browser

    One can argue that the “connected client” design (as described above) works best for RIA’s, as they’re rich Web applications intended to enhance the user experience when connected to the cloud, to begin with. Or essentially, everyone can assume ubiquitous connectivity anytime and everywhere, so applications only need to live in the cloud, then all we need is a browser.

    But we’re also starting to see very valid scenarios emerge to access the cloud outside of the browser. And applications in these scenarios are often more refined/specialized than their HTML-based counterparts. For example, iPhone apps, iTunes, Wii, Adobe AIR, Google Gears, Google Desktop, Apple Dashboard Widgets, Yahoo! Gadgets, Windows Gadgets, FireFox extensions and plug-ins, Internet Explorer 8 WebSlices, SideBars, and ToolBars, etc. These out-of-browser application models are very valid because it has been observed that users tend to spend more time with them, than the websites behind those clients.

    There are many potential reasons for this observation, with some more prevalent in some scenarios. For example, local content is more visible to users and easier to access, without having to open a browser and then finding the website (sometimes needing to re-authenticate) and downloading the content. And of course, content and data can be stored locally for off-line access.

    Now that Silverlight 3 also supports installation of applications locally on the desktop, we can add the option of caching some data locally, in addition to being a locally installed visualization and interaction layer to cloud-based services, or a rich disconnected gadget. And McObject’s Perst embedded database would again be a really compelling solution to help with managing the data.

    This is especially important for out-of-browser and locally installed applications, as one of their primary benefits is to support off-line usage. Having a database management solution such as Perst can help increase the overall robustness of the user experience, by ensuring work continuity regardless of connection state.

    About McObject

    Founded by embedded database and real-time systems experts, McObject offers proven data management technology that makes applications and devices smarter, more reliable and more cost-effective to develop and maintain. McObject counts among its customers industry leaders such as Chrysler, Maximizer Software, Siemens, Phillips, EADS, JVC, Tyco Thermal Controls, F5 Networks, DIRECTV, CA, Motorola and Boeing.

Page 2 of 28 (137 items) 12345»