Architecture + Strategy

Musings from David Chou - Architect, Microsoft

  • Architecture + Strategy

    Designing for Cloud-Optimized Architecture


    I wanted to take the opportunity to talk about cloud-optimized architecture as an implementation model, instead of the popular perception of cloud computing as merely a deployment model. While cloud platforms like Windows Azure can run a variety of workloads, including many legacy/existing on-premises software and application migration scenarios that run on Windows Server, I think Windows Azure’s platform-as-a-service model offers a few additional distinct technical advantages when we design an architecture that is optimized (or targeted) for the cloud platform.

    Cloud platforms differ from hosting providers

    First off, the major cloud platforms (regardless of how we classify them as IaaS or PaaS), at the time of this writing, impose certain limitations or constraints in the environment that make them different from existing on-premises server environments (saving the public/private cloud debate for another time), and different from outsourced hosting managed service providers. Just to cite a few (according to my own understanding, at the time of this writing):

    • Amazon Web Services
      • EC2 instances are inherently stateless; that is, their local storage is non-persistent and non-durable
      • Little or no control over the infrastructure used beneath the EC2 instances (of course, the benefit is that we don’t have to be concerned with it)
      • Requires systems administrators to configure and maintain OS environments for applications
    • Google App Engine
      • Non-VM/OS instance-aware platform abstraction, which further simplifies code deployment and scaling, though with some technical constraints (or requirements) as well. For example:
      • Stateless application model
      • Requires data de-normalization (although Hosted SQL will mitigate some concerns in this area)
      • If the application can't load into memory within 1 second, it may fail to load and return a 500 error code (from Carlos Ble)
      • No request can take more than 30 seconds to run, otherwise it is stopped
      • Read-only file system access
    • Windows Azure Platform
      • Windows Azure instances are also inherently stateless – round-robin load balancer and non-persistent local storage
      • Also, due to the need to abstract infrastructure complexities, little or no control over the underlying infrastructure is offered to applications
      • SQL Azure has individual DB sizing constraints due to its 3-replica synchronization architecture

    Again, this is just based on my understanding, and I’m really not trying to paint a “who’s better or worse” comparison. The point is, these so-called “differences” exist because of many architectural and technical decisions and trade-offs made to provide the abstractions from the underlying infrastructure. For example, the list above is representative of the most common infrastructure approach: using homogeneous, commodity hardware, and achieving performance through scale-out of the cloud environment (there’s another camp of vendors advocating big-machine, scale-up architectures that are more similar to existing on-premises workloads). Also, the list above may seem unfair to Google App Engine, but the flip side of those constraints is that App Engine forces us to adopt distributed computing best practices and develop more efficient applications, which operate in a highly abstracted cloud and benefit from automatic scalability, without our having to be concerned at all with the underlying infrastructure. Most importantly, the intention is to highlight that there are a few common themes across the list above – stateless application model, abstraction from infrastructure, etc.

    Furthermore, if we take a cloud computing perspective, instead of trying to apply traditional on-premises architecture principles, then these are not really “limitations”, but rather “requirements” of the new cloud computing development paradigm. That is, if we approach cloud computing not from the perspective of how to run or deploy third-party/open-source/packaged or custom-written software, but from the perspective of how to develop against the cloud platform, then we may find more feasible and effective uses of cloud platforms than traditional software migration scenarios.

    Windows Azure as an “application platform”

    Fundamentally, this is about looking at Windows Azure as a cloud platform in its entirety, not just as a hosting environment for Windows Server workloads (which works too, but the focus of this article is the cloud-optimized architecture side of things). In fact, Windows Azure got its name because it is something a little different from Windows Server (at the time of this writing). Technically, even though the Windows Azure guest VM OS is still Windows Server 2008 R2 Enterprise today, the application environment isn’t exactly the same as having your own Windows Server instances (even with the new VM Role option). It is more about leveraging the entire Windows Azure platform, as opposed to building solely on top of the Windows Server platform.

    For example, below is my own interpretation of the platform capabilities baked into Windows Azure platform, which includes SQL Azure and Windows Azure AppFabric also as first-class citizens of the Windows Azure platform; not just Windows Azure.


    I prefer using this view because I think there is value to looking at Windows Azure platform holistically. And instead of thinking first about its compute (or hosting) capabilities in Windows Azure (where most people tend to focus on), it’s actually more effective/feasible to think first from a data and storage perspective. As ultimately, code and applications mostly follow data and storage.

    For one thing, the data and storage features in the Windows Azure platform are also a little different from having our own on-premises SQL Server or file storage systems (whether distributed or local to Windows Server file systems). The Windows Azure Storage services (Table, Blob, Queue, Drive, CDN, etc.) are themselves highly distributed applications that provide near-infinitely-scalable storage working transparently across an entire data center. Applications just use the storage services, without needing to worry about their technical implementation and upkeep. By contrast, with traditional outsourced hosting providers that don’t yet have their own distributed storage systems, we’d still have to figure out how to implement and deploy a highly scalable and reliable storage system when deploying our software. But of course, the Windows Azure Storage services require us to use new programming interfaces and models (primarily REST-based APIs), hence the difference from existing on-premises Windows Server environments.
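    As a small illustration of the REST-based access model: a publicly readable blob can be fetched with a plain HTTP GET against a predictable URL built from the storage account, container, and blob names. The sketch below (the account, container, and blob names are illustrative) just assembles that URL; private containers additionally require a signed Authorization header.

```java
// Windows Azure Blob storage addresses resources via REST-style URLs of the
// form http://<account>.blob.core.windows.net/<container>/<blob>.
public class BlobAddress {
    public static String blobUrl(String account, String container, String blob) {
        return "http://" + account + ".blob.core.windows.net/" + container + "/" + blob;
    }

    public static void main(String[] args) {
        // An HTTP GET against this URL returns the blob's contents
        // (for private containers, a signed Authorization header is required).
        System.out.println(blobUrl("myaccount", "apps", "glassfish.zip"));
        // prints http://myaccount.blob.core.windows.net/apps/glassfish.zip
    }
}
```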

    SQL Azure, similarly, is not just a plethora of hosted SQL Server instances dedicated to customers/applications. SQL Azure is actually a multi-tenant environment where each SQL Server instance can be shared among multiple databases/clients, and for reliability and data integrity purposes, each database has 3 replicas on different nodes and has an intricate data replication strategy implemented. The Inside SQL Azure article is a very interesting read for anyone who wants to dig into more details in this area.

    Besides, in most cases, a piece of software that runs in the cloud needs to interact with data (SQL or NoSQL) and/or storage in some manner. And because data and storage options in the Windows Azure platform are a little different from their seeming counterparts in on-premises architectures, applications often require some changes as well (in addition to the differences in Windows Azure alone). However, if we look at these differences simply as requirements (what we have) in the cloud environment, instead of constraints/limits (what we don’t have) compared to on-premises environments, then it will take us down the path of building cloud-optimized applications, even though it might rule out a few application scenarios as well. And the benefit is that, by leveraging the platform components as they are, we don’t have to invest in the engineering effort to architect, build, and deploy highly reliable and scalable data management and storage systems (e.g., building and maintaining our own deployments of Cassandra, MongoDB, CouchDB, MySQL, memcached, etc.) to support applications; we can just use them as native services of the platform.

    The platform approach allows us to focus our efforts on designing and developing the application to meet business requirements and improve user experience, by abstracting away the technical infrastructure for data and storage services (and many other interesting ones in AppFabric such as Service Bus and Access Control), and system-level administration and management requirements. Plus, this approach aligns better with the primary benefits of cloud computing – agility and simplified development (less cost as a result).

    Smaller pieces, loosely coupled

    Building for the cloud platform means designing for cloud-optimized architectures. And because cloud platforms are a little different from traditional on-premises server platforms, this results in a new development paradigm. I previously touched on this topic in my presentation at JavaOne 2010, then later at Cloud Computing Expo 2010 Santa Clara; I’m just adding some more thoughts here. To clarify, this approach is most relevant to the current class of “public cloud” platform providers, such as the ones identified earlier in this article, as they all employ homogeneous, commodity servers, with one of the goals being to greatly simplify and automate deployment, scaling, and management tasks.

    Fundamentally, a cloud-optimized architecture is one that favors smaller, loosely coupled components in a highly distributed systems environment, over the traditional monolithic, accomplish-more-within-the-same-memory-or-process-or-transaction-space application approach. This is not just because, from a cost perspective, running 1000 hours’ worth of processing in one VM costs relatively the same as running one hour each in 1000 VMs on cloud platforms (whereas the cost differential between 1 server and 1000 servers is far greater in an on-premises environment). It is also because, at a similar cost, that one unit of work can be accomplished in approximately one hour (in parallel), as opposed to ~1000 hours (sequentially). In addition, the resulting “smaller pieces, loosely coupled” architecture can scale more effectively and seamlessly than a traditional scale-up architecture (and usually costs less, too). Thus, there are some distinct benefits to be gained by architecting a solution for the cloud (lots of small units of work running on thousands of servers), as opposed to trying to do the same thing we do in on-premises environments (fewer, larger transactions running on a few large servers in HA configurations).

    I like using the LEGO analogy below. From this perspective, the “smaller pieces, loosely coupled” fundamental design principle is sort of like building LEGO sets. To build bigger sets (from a scaling perspective), with LEGO we’d simply use more of the same pieces, as opposed to trying to use bigger pieces. And of course, the same pieces allow us to scale the solution down as well (and not having to glue LEGO pieces together means they’re loosely coupled). ;)

    But this architecture also has some distinct impacts to the way we develop applications. For example, a set of distributed computing best practices emerge:

    • asynchronous processes (event-driven design)
    • parallelization
    • idempotent operations (handle duplicity)
    • de-normalized, partitioned data (sharding)
    • shared nothing architecture
    • fault-tolerance by redundancy and replication
    • etc.

    Asynchronous, event-driven design – This approach advocates off-loading as much work from user requests as possible. For example, many applications simply incur the work to validate/store the incoming data, record it as an occurrence of an event, and return immediately. In essence, it’s about divvying up the work that makes up one unit of work in a traditional monolithic architecture, as much as possible, so that each component accomplishes only what is minimally and logically required. The rest of the end-to-end business tasks and processes can then be off-loaded to other threads, which in cloud platforms can be distributed processes running on other servers. This results in a more even distribution of load and better utilization of system resources (plus improved perceived performance from a user’s perspective), enabling simpler scale-out scenarios, as additional processing nodes and instances can simply be added to (or removed from) the overall architecture without any complicated management overhead. This is nothing new, of course; many applications that leverage Web-oriented architectures (WOA), such as Facebook, Twitter, etc., have applied this pattern in practice for a long time. Lastly, this also aligns well with the common stateless “requirement” in the current class of cloud platforms.
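    A minimal sketch of this pattern in Java: the front-end path validates and enqueues an event, returning right away, while a separate worker drains the queue later. The in-memory BlockingQueue here merely stands in for a durable cloud queue (such as Windows Azure Queue storage), and all the names are illustrative.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Asynchronous, event-driven intake: accept and record the event, return
// immediately; heavier processing happens later in a background worker.
public class AsyncIntake {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    // Front-end work: minimal validation, record the event, return right away.
    public boolean submit(String order) {
        if (order == null || order.isEmpty()) return false;
        return queue.offer(order); // enqueue for later processing
    }

    // Background worker: drains events and performs the end-to-end processing.
    public int drain() {
        int processed = 0;
        while (queue.poll() != null) {
            // ... fulfill the order, update downstream systems, etc.
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        AsyncIntake intake = new AsyncIntake();
        intake.submit("order-1");
        intake.submit("order-2");
        System.out.println(intake.drain()); // prints 2
    }
}
```

    In a cloud deployment, `submit` and `drain` would run in separate role instances, so each side can be scaled out independently.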

    Parallelization – Once the architecture is running as smaller, loosely coupled pieces, we can leverage parallelization of processes to further improve the performance and throughput of the resulting system architecture. Again, this wasn’t so prevalent in traditional on-premises environments, because creating 1000 additional threads on the same physical server doesn’t buy much more performance when it is already bearing a lot of traffic (even on really big machines). But in cloud platforms, this can mean running the processes on 1000 additional servers, and for some processes this makes a very significant difference. Google’s Web search infrastructure is a great example of this pattern; it is publicized that each search query gets parallelized to the degree of ~500 distributed processes, with the individual results then pieced together by the search rank algorithms and presented to the user. This also aligns with the de-normalized data “requirement” in the current class of cloud platforms, as well as with SQL Azure’s sizing constraints and the consequent best practice of partitioning databases: parallelized processes can map to database shards, while avoiding significant increases in concurrency levels on individual databases, which can still degrade overall performance.
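    The fan-out/merge idea can be sketched with stdlib concurrency primitives: a query is dispatched to each data partition in parallel, and the partial results are merged afterward. The partition data and search logic below are illustrative stand-ins for the kind of distributed query described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Fan out a query to every partition in parallel, then merge partial results.
public class FanOut {
    public static List<String> search(List<List<String>> partitions, String term)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(partitions.size());
        List<Future<List<String>>> futures = new ArrayList<>();
        for (List<String> partition : partitions) {
            futures.add(pool.submit(() -> {          // one task per partition
                List<String> hits = new ArrayList<>();
                for (String doc : partition)
                    if (doc.contains(term)) hits.add(doc);
                return hits;
            }));
        }
        List<String> merged = new ArrayList<>();
        for (Future<List<String>> f : futures) merged.addAll(f.get()); // combine
        pool.shutdown();
        return merged;
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> shards = Arrays.asList(
            Arrays.asList("azure storage", "sql azure"),
            Arrays.asList("app engine", "azure appfabric"));
        System.out.println(search(shards, "azure")); // three matching documents
    }
}
```

    On a cloud platform, each task would be a request to another role instance or shard rather than a local thread, but the dispatch-and-merge shape is the same.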

    Idempotent operations – Now that we run in a distributed but stateless environment, we need to make sure that the same request routed to multiple servers doesn’t result in multiple logical transactions or business state changes. There are processes that can tolerate, or even prefer, duplicate transactions, such as ad clicks; but there are also processes where multiple requests must not be handled as duplicates. The stateless (and, in Windows Azure, round-robin load-balanced) nature of cloud platforms requires us to put more thought into scenarios such as a user managing to send multiple submits from a shopping cart, as these requests would get routed to different servers (as opposed to stateful architectures, where sticky sessions would route them back to the same server), and each server wouldn’t know about the process running on the other server(s). There is no easy way around this, as the application ultimately needs to know how to handle conflicts due to concurrency. The most common approach is to implement some sort of transaction ID that uniquely identifies the unit of work (as opposed to simply relying on user context), then choose between last-writer-wins, first-writer-wins, or optimistic locking (though any form of locking starts to reduce the effectiveness of the overall architecture).
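    The transaction-ID approach can be sketched as follows: the first request bearing a given ID wins, and duplicates become no-ops (first-writer-wins). The ConcurrentMap here is an illustrative stand-in for a shared durable store keyed by transaction ID; class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Idempotent order submission keyed by a client-generated transaction ID.
public class IdempotentOrders {
    // Stand-in for a shared durable store (e.g., a table keyed by transaction ID).
    private final ConcurrentMap<String, String> processed = new ConcurrentHashMap<>();

    // Returns true only for the first request with this transaction ID;
    // duplicate submits routed to other servers become no-ops.
    public boolean placeOrder(String transactionId, String payload) {
        return processed.putIfAbsent(transactionId, payload) == null;
    }

    public static void main(String[] args) {
        IdempotentOrders orders = new IdempotentOrders();
        System.out.println(orders.placeOrder("txn-42", "cart contents")); // true
        System.out.println(orders.placeOrder("txn-42", "cart contents")); // false (duplicate)
    }
}
```

    The atomic `putIfAbsent` makes the "first writer wins" decision in one step, which is what lets concurrent duplicates on different servers resolve consistently.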

    De-normalized, partitioned data (sharding) – Many people perceive the sizing constraints in SQL Azure (currently at 50GB – also note it’s the DB size and not the actual file size which may contain other related content) as a major limitation in Windows Azure platform. However, if a project’s data can be de-normalized to a certain degree, and partitioned/sharded out, then it may fit well into SQL Azure and benefit from the simplicity, scalability, and reliability of the service. The resulting “smaller” databases actually can promote the use of parallelized processes, perform better (load more distributed than centralized), and improve overall reliability of the architecture (one DB failing is only a part of the overall architecture, for example).
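    A minimal sketch of shard routing: hash a partition key (for example, a customer ID) to pick one of N databases. The shard count and key choice are illustrative; real schemes also have to plan for re-balancing when shards are added.

```java
// Route de-normalized rows to database shards by hashing a partition key.
public class ShardRouter {
    private final int shardCount;

    public ShardRouter(int shardCount) { this.shardCount = shardCount; }

    // Map a partition key to one of N databases; floorMod keeps the
    // result non-negative even for negative hash codes.
    public int shardFor(String partitionKey) {
        return Math.floorMod(partitionKey.hashCode(), shardCount);
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(4);
        int shard = router.shardFor("customer-1001");
        System.out.println(shard); // a stable value in [0, 4)
    }
}
```

    Because the same key always lands on the same shard, parallelized processes can each work against their own database, keeping concurrency on any individual database low.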

    Shared nothing architecture – This means a distributed computing architecture in which each node is independent and self-sufficient, and there is no single point of contention across the system. With data sharding and maintained in many distributed nodes, the application itself can and should be developed using shared-nothing principles. But of course, many applications need access to shared resources. It is then a matter of deciding whether a particular resource needs to be shared for read or write access, and different strategies can be implemented on top of a shared nothing architecture to facilitate them, but mostly as exceptions to the overall architecture.
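    The shared-nothing principle can be sketched in a few lines: each worker touches only the partition it owns and keeps purely local state, and partial results are merged only after all workers finish, so there is no point of contention. The word-counting task and data here are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Shared-nothing aggregation: each worker owns its partition and shares no
// mutable state with the others; partials are merged at the end.
public class SharedNothingCount {
    public static long countWords(List<List<String>> partitions)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(partitions.size());
        List<Future<Long>> partials = new ArrayList<>();
        for (List<String> owned : partitions) {
            partials.add(pool.submit(() -> {
                long count = 0;                    // node-local state only
                for (String line : owned) count += line.split("\\s+").length;
                return count;
            }));
        }
        long total = 0;
        for (Future<Long> f : partials) total += f.get(); // merge after the fact
        pool.shutdown();
        return total;
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> parts = Arrays.asList(
            Arrays.asList("small pieces"), Arrays.asList("loosely coupled"));
        System.out.println(countWords(parts)); // prints 4
    }
}
```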

    Fault-tolerance by redundancy and replication – This is also the “design for failures” principle referred to by many cloud computing experts. Because of the use of commodity servers in these cloud platform environments, system failures are a common occurrence (hardware failures happen almost constantly in massive data centers), and we need to design applications to withstand them. Similar to the thoughts on idempotency above, designing for failures basically means allowing requests to be processed again; “try again”, essentially.
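    The "try again" idea can be sketched as a small retry wrapper around a (presumed idempotent) operation, riding out transient failures before surfacing a real one. The retry count and linear backoff below are illustrative choices, not a prescription.

```java
import java.util.concurrent.Callable;

// Retry a (presumed idempotent) operation to ride out transient failures.
public class Retry {
    public static <T> T withRetries(Callable<T> op, int maxAttempts) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;                                      // transient failure: try again
                if (attempt < maxAttempts) Thread.sleep(100L * attempt); // simple backoff
            }
        }
        throw last; // retries exhausted; surface the failure
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated operation that fails twice, then succeeds.
        String result = withRetries(() -> {
            if (++calls[0] < 3) throw new RuntimeException("transient");
            return "ok";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts"); // ok after 3 attempts
    }
}
```

    Note that retrying is only safe when combined with the idempotency practice above; otherwise a retried request can produce duplicate state changes.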

    Lastly, each of the topic areas above is worthy of an individual article and detailed analysis, and plenty of content is available on the Web that provides much more insight. The point here is that each of the principles above has some relationship with, and dependency on, the others. It is the combination of these principles that contributes to an effective distributed computing, cloud-optimized architecture.

  • Architecture + Strategy

    Run Java with GlassFish in Windows Azure


    At PDC10 (Microsoft’s Professional Developers Conference 2010), Microsoft again provided affirmation of its support for Java in Windows Azure. “We're making Java a first-class citizen with Windows Azure, the choice of the underlying framework, the choice of the development tool,” said Bob Muglia (President of Server and Tools at Microsoft) during his keynote presentation (transcript). Then during PDC, Vijay Rajagopalan delivered a session (Open in the Cloud: Windows Azure and Java) which provided more details on the state of many deliverables, including:

    Vijay also talked about, during his presentation, a successful deployment of Fujitsu’s Interstage application server (a Java EE 6 app server based on GlassFish) in Windows Azure. A whole slew of base platform improvements was also announced via the Windows Azure team blog, which helped address many of the limitations we observed last year, such as not being able to use NIO, as described in my earlier work with Jetty and Windows Azure.

    Lots of great news, and I was finally able to sit down and try some things hands-on, with the latest release of the Windows Azure SDK (version 1.3; November 2010) that included many of the announced improvements.

    Java NIO now works!

    First off, one major limitation identified previously was that the networking sandbox model in Windows Azure (in place for security reasons) also blocked the loopback adapter that NIO needed. At PDC this was discussed, and the fact that the Fujitsu Interstage app server worked (which uses GlassFish, which uses NIO) proved that this now works. Fortunately, there isn’t anything additional we need to do to “enable” NIO; it just works now. I tried my simple Jetty Azure project, changing it back to use org.eclipse.jetty.server.nio.SelectChannelConnector, deployed it into Windows Azure, and it ran!

    Also worth noting was that the startup time in Windows Azure was significantly improved. My Jetty Azure project took just a few minutes to become accessible on the Internet (it had taken more than 20 minutes at one point in time).

    Mario Kosmiskas also posted an excellent article which showed Jetty and NIO working in Windows Azure (and many great tips which I leveraged for the work below).

    Deploying GlassFish

    Since Fujitsu Interstage (based on GlassFish) already works in Azure, GlassFish itself should work as well. So I thought I’d give it a try and learn from the exercise. First I tried to build on the Jetty work, but ran into issues with needing Visual Studio and the Azure tools to copy/move/package large numbers of files and nested folders when everything is placed inside the project structure – GlassFish itself has 5000+ files and 1100+ folders (the resulting super-long file path names for a few files caused the issue). This became a good reason to try the approach of loading application assets into role instances from the Blob storage service, instead of packaging everything inside the same Visual Studio project (as is the best practice for ASP.NET Web Role projects).

    This technique was inspired by Steve Marx’s article last year (role instance using Blob service) and realized using the work from Mario Kosmiskas’ article (PowerShell magic); with it, I was able to have an instance of GlassFish Server Open Source Edition 3.1 (build 37; latest at the time of this writing) deployed and running in Windows Azure in a matter of minutes. Below is a screenshot of the GlassFish Administration Console running on a subdomain (in Azure).


    To do this, basically just follow the detailed steps in Mario Kosmiskas’ article. I will highlight the differences here, plus a few bits that weren’t mentioned in the article. The easiest way is to just reuse his Visual Studio project; I started from scratch so that I could use GlassFish-specific names throughout.

    1. Create a new Cloud project, and add a Worker Role (I named mine GlassFishService). The project will open with a base project structure.

    2. Copy and paste in the files (from Mario’s project, under the ‘JettyWorkerRole’ folder, and pay attention to his comments on Visual Studio’s default UTF-8 encoding):

    • lib\ICSharpCode.SharpZipLib.dll
    • Launch.ps1
    • Run.cmd

    Paste them into Worker Role. For me, the resulting view in Solution Explorer is below:


    3. Open ServiceDefinition.csdef, and add the Startup and Endpoints information. The resulting file should look something like this:

    <?xml version="1.0" encoding="utf-8"?>
    <ServiceDefinition name="GlassFishAzure" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
      <WorkerRole name="GlassFishService">
        <Imports>
          <Import moduleName="Diagnostics" />
        </Imports>
        <Startup>
          <Task commandLine="Run.cmd" executionContext="limited" taskType="background" />
        </Startup>
        <Endpoints>
          <InputEndpoint name="Http_Listener_1" protocol="tcp" port="8080" localPort="8080" />
          <InputEndpoint name="Http_Listener_2" protocol="tcp" port="8181" localPort="8181" />
          <InputEndpoint name="Http_Listener_3" protocol="tcp" port="4848" localPort="4848" />
          <InputEndpoint name="JMX_Connector_Port" protocol="tcp" port="8686" localPort="8686" />
          <InputEndpoint name="Remote_Debug_Port" protocol="tcp" port="9009" localPort="9009" />
        </Endpoints>
      </WorkerRole>
    </ServiceDefinition>

    As you can see, the Startup element instructs Windows Azure to execute Run.cmd as a startup task. And the InputEndpoint elements are used to specify ports that GlassFish server needs to listen to for external connections.

    4. Open Launch.ps1 and make a few edits. I kept the existing functions unchanged. The parts that changed are shown below (and highlighted):

    $connection_string = 'DefaultEndpointsProtocol=http;AccountName=dachou1;AccountKey=<your acct key>'
    # JRE
    $jre = ''
    download_from_storage 'java' $jre $connection_string (Get-Location).Path
    unzip ((Get-Location).Path + "\" + $jre) (Get-Location).Path
    # GlassFish
    $glassfish = ''
    download_from_storage 'apps' $glassfish $connection_string (Get-Location).Path
    unzip ((Get-Location).Path + "\" + $glassfish) (Get-Location).Path
    # Launch Java and GlassFish
    .\jre\bin\java `-jar .\glassfish3\glassfish\modules\admin-cli.jar start-domain --verbose

    Essentially, update the script with:

    • Account name and access key from your Windows Azure Storage Blob service (where you upload the JRE and GlassFish zip files into)
    • Actual file names you’re using
    • The appropriate command that references eventual file locations to launch the application (or call another script)

    The script above extracted both zip files into the same location (at the time of this writing, in Windows Azure, it is the local storage at E:\approot). So your commands need to reflect the appropriate file and directory structure based on how you put together the zip files.

    5. Upload the zip files into the Windows Azure Storage Blob service. I followed the same conventions of placing them into ‘apps’ and ‘java’ containers. For GlassFish, I used the zip file downloaded directly from the GlassFish site. For the Java Runtime (JRE), I zipped the ‘jre’ directory under the SDK I installed. Many existing tools can help with the upload from a user’s perspective; I used Neudesic’s Azure Storage Explorer. As a result, Server Explorer in Visual Studio showed this view in Windows Azure Storage (and the containers would show the uploaded files):


    That is it! Now we can upload this project into Windows Azure and have it deployed. Just publish it and let Visual Studio and Windows Azure do the rest:


    Once completed, you could go to port 8080 on the Website URL to access GlassFish, which should show something like this:


    If you’d like to use the default port 80 for HTTP, just go back to ServiceDefinition.csdef and update the one line to:

    <InputEndpoint name="Http_Listener_1" protocol="tcp" port="80" localPort="8080" />

    GlassFish itself will still listen on port 8080, but externally the Windows Azure cloud environment will receive user requests on port 80, and route them to the VM GlassFish is running in via port 8080.

    Granted, this is a very simplified scenario, and it only demonstrates that the GlassFish server can run in Windows Azure. There is still a lot of work needed to enable the functionality Java applications expect from a server, such as systems administration and management components, integration points with other systems, logging, relational database access and resource pooling, inter-node communication, etc.

    In addition, some environmental differences, such as Windows Azure being a stateless environment with a non-persistent local file system, also need to be mitigated. These differences make Windows Azure a little different from existing Windows Server environments, and thus there are different things we can do with Windows Azure rather than simply using it as an outsourced hosting environment. My earlier post based on the JavaOne presentation goes into a bit more detail on how highly scalable applications can be architected differently in Windows Azure.

    Java deployment options for Windows Azure

    With Windows Azure SDK 1.3, we now have a few approaches we can pursue to deploy Java applications in Windows Azure. A high-level overview based on my interpretations:

    Worker Role using Visual Studio deployment package – This is essentially the approach outlined in my earlier work with Jetty and Windows Azure. With this approach, all of the files and assets (JRE, Jetty distributable, applications, etc.) are included in the Visual Studio project, then uploaded into Windows Azure in a single deployment package. To kick-off the Java process we can write bootstrapping code in the Worker Role’s entry point class on the C# side, launch scripts, or leverage SDK 1.3’s new Startup Tasks feature to launch scripts and/or executables.

    Worker Role using Windows Azure Storage Blob service – This is the approach outlined in Mario Kosmiskas’ article. With this approach, all of the files and assets (JRE, Jetty distributable, applications, etc.) are uploaded and managed in the Windows Azure Storage Blob service separately, independent of the Visual Studio project that defines the roles and services configurations. To kick-off the Java process we could again leverage SDK 1.3’s new Startup Tasks feature to launch scripts and/or executables. Technically we can invoke the script from the role entry class in C# too, which is what the initial version of the Tomcat Accelerator does, but Java teams may prefer an existing hook to call the script.

    Windows Azure VM Role – The new VM Role feature could be leveraged, so that we can build a Windows Server-based image with all of the application files and assets installed and configured, and upload that image into Windows Azure for deployment. But note that while this may be perceived as the approach that most closely aligns to how Java applications are deployed today (by directly working with the server OS and file system), in Windows Azure this means trading off the benefits of automated OS management and other cloud fabric features.

    And perhaps, with the new Remote Desktop feature in Windows Azure, we can probably manually install and configure Java application assets. But doing so sort of treats Windows Azure as a simple hosting facility (which it isn’t) and defeats the purpose of all the fault-tolerance, automated provisioning and management capabilities in Windows Azure. In addition, for larger deployments (many VM’s) it would become increasingly tedious and error-prone if each VM needs to be set up manually.

    In my opinion, the Worker Role using Windows Azure Storage Blob service approach is ideally suited for Java applications, for a few reasons:

    • All of the development and application testing work can still be accomplished in the tools you’re already using. Visual Studio and dealing with Windows Azure SDK and tools are only needed from a deployment perspective – deploying the script that launches the Java process
    • Managing individual zip files (or other archive formats) and at any granularity level, instead of needing to put everything into the same deployment package when using Visual Studio for everything
    • Loading application assets from the Windows Azure Storage Blob service also provides more flexibility for versioning, reuse (of components), and managing smaller units of changes. We can have a whole set of scripts that load different combinations of assets depending on version and/or intended functionality

    Perhaps at some point we could just upload scripts into the Blob service, and the only remaining step would be telling the Windows Azure management portal which scripts to run, as Startup Tasks, for which roles and instances, from the assets we store in the Blob service. There are still things we would use Visual Studio for – setting up IntelliTrace, facilitating certificates for Remote Desktop, etc. However, treating the Blob service as a distributed storage system, automating the transfer of specific components and assets from storage to each role instance, and launching the Java process to run the application could be a very interesting way for Java teams to run applications in the Windows Azure platform.

    Lastly, this approach is not limited to Java applications. In fact, any software components that can be loaded from Blob storage and installed in an automated, script-driven manner (we may even be able to use another layer of script interpreters, such as Cygwin) can use this approach for deployment in Windows Azure. By managing libraries, files, and assets in Blob storage, and maintaining a set of scripts, a team can operate and manage multiple (and multiple versions of) applications in Windows Azure, with comparatively higher flexibility and agility than managing individual virtual machine images (especially for smaller units of change).

  • Architecture + Strategy

    Building Highly Scalable Java Applications on Windows Azure (JavaOne 2010)


    JavaOne has always been one of my favorite technology conferences, and this year I had the privilege to present a session there. Given my background in Java, previous employment at Sun Microsystems, and the work I’m currently doing with Windows Azure at Microsoft, it’s only natural to try to piece them together and find more ways to use them. Well, honestly, this also gives me an excuse to attend the conference, plus the co-located Oracle OpenWorld, along with 41,000 other attendees. ;)

    A related article published on InfoQ may also provide some context around this presentation, as does my earlier post on getting Jetty to work in Azure, which goes into a bit more technical detail on how a Java application can be deployed and run in Windows Azure.

    Java in Windows Azure

    So at the time of this writing, deploying and running Java in Windows Azure is conceptually analogous to launching a JVM and running a Java app from files stored on a USB flash drive (or files extracted from a zip/tar file without any installation procedures). This is primarily because Windows Azure isn’t a simple server/VM hosting environment. The Windows Azure cloud fabric provides a lot of automation and abstraction so that we don’t have to deal with server OS administration and management. For example, developers only have to upload application assets – code, data, content, policies, configuration files, service models, etc. – while Windows Azure manages the underlying infrastructure:

    • application containers and services, distributed storage systems
    • service lifecycle, data replication and synchronization
    • server operating system, patching, monitoring, management
    • physical infrastructure, virtualization, networking
    • security
    • “fabric controller” (automated, distributed service management system)

    The benefit of this cloud fabric environment is that developers don’t have to spend time and effort managing the server infrastructure; they can focus on the application instead. However, the higher abstraction level also means we are interacting with sandboxes and containers, and there are constraints and limitations compared to the on-premise model where the server OS itself (or middleware and app server stack we install separately) is considered the platform. Some of these constraints and limitations include:

    • dynamic networking – requires interaction with the fabric to figure out the networking environment available to a running application. And as documented, at this moment, the NIO stack in Java is not supported because of its use of loopback addresses
    • no OS-level access – cannot install software packages
    • non-persistent local file system – have to persist files elsewhere, including log files and temporary and generated files
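    The non-persistent local file system constraint, for example, means log output has to be shipped to durable storage rather than written to local disk. A minimal sketch of that pattern follows; the `Sink` interface and blob names here are hypothetical stand-ins for the actual upload call (e.g. a PUT to a Blob service container):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: since local disk is non-persistent, buffer log lines in memory
// and flush them in batches to durable storage. Sink is a hypothetical
// stand-in for the real upload (e.g. Blob service PUT).
public class LogShipper {
    public interface Sink { void store(String blobName, byte[] data); }

    private final Sink sink;
    private final int flushAt;
    private final List<String> buffer = new ArrayList<>();
    private int batch = 0;

    public LogShipper(Sink sink, int flushAt) {
        this.sink = sink;
        this.flushAt = flushAt;
    }

    // Append a line; flush automatically when the batch is full.
    public synchronized void log(String line) {
        buffer.add(line);
        if (buffer.size() >= flushAt) flush();
    }

    // Push the current batch to durable storage and start a new one.
    public synchronized void flush() {
        if (buffer.isEmpty()) return;
        sink.store("log-" + (batch++) + ".txt",
                   String.join("\n", buffer).getBytes());
        buffer.clear();
    }
}
```

    The same buffering-and-shipping shape applies to temporary and generated files: treat local storage as a scratch area, and persist anything that must survive instance recycling.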

    These constraints impact Java applications because the JVM is itself a container and needs this higher level of control, whereas .NET apps can leverage the automation enabled in the container. The good news is that the Windows Azure team is working hard to deliver many enhancements to help with these issues – and, interestingly, in both directions: adding more higher-level abstraction as well as providing more lower-level control.

    Architecting for High Scale

    So at some point we will be able to deploy full Java EE application servers and enable clustering and stateful architectures, but for really large-scale applications (at the level of Facebook and Twitter, for example), the current recommendation is to leverage shared-nothing and stateless architectures. This is largely because, in cloud environments like Azure, the vertical scaling ceiling for physical commodity servers is not very high, and adding more nodes to a cluster architecture means we don’t get to leverage the automated management capabilities built into the cloud fabric. There is also the need to design for system failures (service resiliency), as opposed to assuming a fully redundant hardware infrastructure as we typically do with large on-premises server environments.


    (Pictures courtesy of LEGO)

    The top-level recommendation for building a large-scale application in commodity-server-based clouds is to apply distributed computing best practices, because we’re operating in an environment with many smaller servers, as opposed to fewer bigger servers. The last part of my JavaOne presentation goes into some of those considerations. Basically: small pieces, loosely coupled. It’s not like traditional server-side development, where we’d try to get everything accomplished within the same process/memory space per user request. Applications can scale much better if we defer (async) and/or parallelize as much work as possible; very similar to Twitter’s current architecture. So we could end up having many front-end Web roles that just receive HTTP requests, persist some data somewhere, fire off event(s) into a queue, and return a response. Then another layer of Worker roles can pick up the messages from the queue and do the rest of the work in an event-driven manner. This model works great in the cloud because we can scale the front-end Web roles independently of the back-end Worker roles, without having to worry about physical capacity.
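    A minimal sketch of this Web role/Worker role split follows, using an in-memory `BlockingQueue` as a stand-in for the Azure Queue service (all names here are hypothetical, chosen just to illustrate the pattern):

```java
import java.util.*;
import java.util.concurrent.*;

// Sketch of the front-end/back-end split: a "Web role" handler enqueues
// an event and returns immediately, while a "Worker role" loop drains
// the queue and does the heavy lifting. A BlockingQueue stands in for
// the Azure Queue service.
public class AsyncPipeline {
    final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    final List<String> processed = Collections.synchronizedList(new ArrayList<>());

    // Web role: persist minimal state, enqueue an event, respond fast.
    public String handleRequest(String orderId) {
        queue.offer("process:" + orderId);
        return "202 Accepted"; // the real work continues asynchronously
    }

    // Worker role: event-driven processing, scaled independently of the
    // front end; here a single drain step stands in for the worker loop.
    public void drainOnce() throws InterruptedException {
        String msg = queue.poll(1, TimeUnit.SECONDS);
        if (msg != null) processed.add(msg.replace("process:", "done:"));
    }
}
```

    Because the two tiers only share the queue, each can be scaled (or fail and recover) independently, which is exactly what the cloud fabric automates.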


    In this model, applications need to be architected with these fundamental principles:

    • Small pieces, loosely coupled
    • Distributed computing best practices
      • asynchronous processes (event-driven design)
      • parallelization
      • idempotent operations (to handle duplicate deliveries)
      • de-normalized, partitioned data (sharding)
      • shared nothing architecture
      • optimistic concurrency
      • fault-tolerance by redundancy and replication
      • etc.
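    For example, idempotent operations matter because queue messages may be delivered more than once; a worker can record the ids of processed messages and apply each message’s effect exactly once. A minimal sketch (with hypothetical message ids and amounts):

```java
import java.util.*;

// Sketch of idempotent message handling: record processed message ids
// so that a redelivered message has no additional effect.
public class IdempotentWorker {
    private final Set<String> seen = new HashSet<>();
    private int total = 0;

    // Returns true if the message was applied, false if it was a duplicate.
    public synchronized boolean process(String messageId, int amount) {
        if (!seen.add(messageId)) return false; // duplicate delivery, skip
        total += amount;
        return true;
    }

    public synchronized int total() { return total; }
}
```

    In a real system the seen-set would itself live in durable, partitioned storage; the point is that reprocessing a message must be safe.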

    Thus, traditionally monolithic, sequential, and synchronous processes can be broken down into smaller, independent/autonomous, and loosely coupled components/services. As a result of the smaller process footprint and loosely coupled interactions, the overall architecture will see better system-level resource utilization (it is easier to handle many smaller and faster units of work), improved throughput and perceived response time, and superior resiliency and fault tolerance; leading to higher scalability and availability.

    Lastly, even though this conversation advocates a different way of architecting Java applications to support high scalability and availability, the same fundamental principles apply to .NET applications as well.

  • Architecture + Strategy

    Windows Phone 7 Developer Launch Event


    Join a select group of developers for an event near you and get hands-on experience with Windows Phone 7.

    Windows Phone 7 Developer Launch

    Windows Phone 7 is here – and with it comes a new world of opportunity for passionate, creative developers. Windows Phone 7 gives you the power to build complex, robust applications using consistent hardware specs, a comprehensive development toolkit, and the all-new, full-service Marketplace for selling your apps. Get ready to capitalize on this exciting new frontier with two days of fast-paced learning and Windows Phone 7 development sessions. Pick the day that best fits your needs – or join us for both. Either way, you'll get the information you need to build high-demand apps with Windows Phone 7.

    Day 1: Jump-Start Your Mobile Development | 8:30am - 5:15pm
    In the first of this two-day launch event, we'll take you under the hood of Windows Phone 7 and the Windows Phone 7 platform with a progressive set of learning sessions. We'll start with the basic tools and fundamentals of Windows Phone 7 application development and as the day unfolds, we'll go deeper into development scenarios using Silverlight, XNA and the Windows Phone 7 SDK. You'll also see how to earn cash for your apps in the fully loaded Marketplace.

    Day 2: Unleash Your Best App Workshop | 9:00am - 4:00pm
    This hands-on workshop is designed to help you turn those napkin sketches and subway scribbles into real, sellable apps. You'll apply fundamental Windows Phone 7 design principles to build an app and upload it to the fully revamped Marketplace. Go at your own pace or follow along with a proctored group lab. Either way, you'll get step-by-step advice from Microsoft and community experts. It's an unprecedented opportunity to stake your claim in the marketplace – using familiar tools and consistent specs.

    Looking forward to seeing you there!

    • Orange County, CA – Hilton Orange County, September 29-30, 2010
    • Mountain View, CA – Microsoft Silicon Valley Office, October 12-13, 2010
    • San Francisco, CA – San Francisco Design Center, October 20-21, 2010

    Each location hosts both Day 1 (Jump-Start) and Day 2 (Workshop).
    Seating is limited. Register online or call 1-877-MSEVENT.

    These dates don't work?
    Watch the Windows Phone 7 Launch Event online, live
    from Mountain View, CA.
    October 12, 2010
    For more Windows Phone 7 developer events, visit:



  • Architecture + Strategy

    International SOA & Cloud Symposium


    October 5-6 2010, Berliner Congress Center, Berlin Germany

    For the first time in Germany! With 100 speaker sessions across 17 tracks, the 3rd International SOA and Cloud Symposium will feature the top experts from around the world. There will also be a series of post-conference certification workshops running from October 7-13, 2010, including (for the first time in Europe) the Certified Cloud Computing Specialist Workshop.

    The new agenda for 2010 is now online and already contains internationally recognized speakers from companies such as Microsoft, HP, IBM, Oracle, SAP, Amazon, Forrester, McKinsey & Company, Red Hat, TIBCO, Layer7, SOA Systems, CGI, Ordina, Logica, Booz Allen Hamilton, as well as organizations such as The SOA Innovation Lab, BIAN, the US Department of Defense, and many more. International experts include Thomas Erl, Dirk Krafzig, Stefan Tilkov, Clemens Utschig, Helge Buckow, Mark Little, Matt Wood, Brian Loesgen, John deVadoss, Nicolai Josuttis, Tony Shan, Paul C. Brown, Satadru Roy, and many more.

    For further information, see

    To register, visit:

    • Use ‘Partner Code’: SOA10-SPK26-10 to get a discount of € 109
    • Additional Early Bird discount of € 109 if registering before September 1