Architecture + Strategy

Musings from David Chou - Architect, Microsoft

  • Architecture + Strategy

    Cloud Computing and the Microsoft Platform

    • 9 Comments

    It has been a couple of months since I wrote about cloud computing and Microsoft’s plans and strategies. Now that Azure Services Platform has been unveiled at PDC2008, and after having the opportunities to discuss it with a community of architects from major enterprises and startups via the Architect Council series of events, I can talk about cloud computing from the perspective of the Microsoft platform, and the architectural considerations that influenced its design and direction.

     

    Okay – cloud computing today is a really overloaded term, even more so than SOA (service-oriented architecture) was when it was the hottest ticket in IT. There are a lot of different perspectives on cloud computing, adding to the confusion and the hype. And unsurprisingly, there is a lot of confusion around Microsoft’s cloud platform too. So here is one way of looking at it.

    Microsoft’s cloud includes SaaS (Software-as-a-Service) offerings as shown in the top row of the above diagram, such as Windows Live and the Business Productivity Online Suite; and the PaaS (Platform-as-a-Service) offering currently branded as the Azure Services Platform. For the rest of this article we will focus on the Azure Services Platform, as it represents a platform on top of which additional capabilities can be developed, deployed, and managed.

    Comprehensive Software + Services Platform

    At Microsoft, we believe that the advent of cloud computing does not necessitate that existing (or legacy) IT assets be moved into the cloud, as it makes more sense to extend to the cloud as opposed to migrate to the cloud. We think that eventually, a hybrid world of on-premise software and cloud-based services will be the majority norm, although the balancing point between the two extremes may vary greatly among organizations of all types and sizes. As a platform company, Microsoft’s intention is to provide a platform that can support the wide range of scenarios in that hybrid world, spanning the spectrum of choices between on-premises software and cloud-based services.

    Thus Microsoft’s cloud platform, from this perspective, is not intended to replace the existing on-premises software products such as our suite of Windows Server products, but rather, completes the spectrum of choices and the capabilities required for a Software + Services model.

    Cloud Platform as a Next-Generation Internet-Scaled Application Environment

    So what is a cloud platform? It should provide an elastic compute environment that offers auto-scalability (small to massive), and ~100% availability. However, while some think that the compute environment means a server VM (virtual machine) allocation/provisioning facility that provides servers (i.e., Windows Servers, Linux Servers, Unix Servers, etc.) for administrators to deploy applications into, Microsoft’s approach with the Azure Services Platform is remarkably different.

    Azure Services Platform is intended to be a platform to support a “new class of applications” – cloud applications.

    In other words, the Azure Services Platform is not just a different location to host our existing database-driven applications, such as traditional ASP.NET web apps or third-party packaged applications deployed on Windows Server. Cloud applications are a different breed of applications. Now, the long-term roadmap does include capabilities to support Windows-Server-whichever-way-we-want-it, but I think the most interesting/innovative part is that it allows us to architect and build cloud applications.

    To clarify, let us take a quick look at the range of options from an infrastructure perspective.

    The diagram above provides a simplified/generalized view of choices we have from a hosting perspective:

    • On-premises: represents the traditional model of purchasing/licensing software, installing it, and managing it in our own data centers
    • Hosted: represents the co-location or managed outsourced hosting services. For example, GoGrid, Amazon EC2, etc.
    • Cloud: represents cloud fabric that provides higher-level application containers and services. For example, Google App Engine, Amazon S3/SimpleDB/SQS, etc.

    From this perspective, “Hosted” represents services that provide servers-at-my-will, but we will interact with the server instances directly, and manage them at the server level so we can configure them to meet our requirements, and install/deploy applications and software just as we have done with existing on-premises software assets. These service providers manage the underlying infrastructure so we only have to worry about our servers, but not the engineering and management efforts required to achieve auto-scale and constant availability.

    “Cloud” moves the concerns even higher up the stack, where application teams only need to focus on managing the applications and specifying to the environment their security and management policies, and the cloud infrastructure will take care of everything else. These service providers manage the application runtimes, so we can focus on deploying and managing business capabilities, as well as higher-level and differentiating aspects such as user experience, information architecture, social communities, branding, etc.

    However, this does not mean that any one of these application deployment/hosting models is inherently better than the others. While most people consider both the “hosted” and “cloud” models described here to be cloud platforms, they are not necessarily more relevant than the on-premises model for all scenarios. These options all present varying trade-offs that we as architects need to understand, in order to make prudent choices when evaluating how to adopt or adapt to the cloud.

    Trade-Offs in the Cloud

    Let us take a closer look at the trade-offs between the on-premises model and the cloud (as differences between “hosted” and “cloud” models are comparatively less).

    At the highest level, we are looking at trade-offs between data consistency and scalability/availability. This is a fundamental difference between on-premises and cloud-based architectures, as “traditional” on-premises system architectures are optimized to provide near-real-time data consistency (sometimes at the cost of scalability and availability), whereas cloud-based architectures are optimized to provide scalability and availability (by compromising data consistency).

    One way to look at this, for example, is how we used to design and build systems using on-premises technologies. We used methods such as pessimistic locking, optimistic locking, and two-phase commit to ensure proper handling of updates to a database from multiple threads. This focus on ensuring the accuracy and integrity of the data was deemed one of the most important aspects of modern IT architectures. However, data consistency is achieved by compromising concurrency. For example, in DBMS design, the highest transaction isolation level, “serializable”, means all transactions appear to occur in a serial manner (in a way, single-threaded), which promises safe updates from multiple clients, but that adversely impacts performance and scalability in highly concurrent systems. Lowering the isolation level helps to improve concurrency, but the database gives up some control over data integrity.
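
    To make this concrete, below is a small illustrative sketch in C# (the connection string, table, and product ID are made up, and this is not code from any particular system): the same update wrapped in a serializable transaction gives the strongest consistency guarantee but holds locks that block concurrent writers, while running it at read committed allows more concurrency with weaker guarantees.

    using System;
    using System.Data.SqlClient;
    using System.Transactions;

    class IsolationDemo
    {
        static void UpdateInventory(IsolationLevel isolation)
        {
            var options = new TransactionOptions { IsolationLevel = isolation };
            using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
            using (var conn = new SqlConnection("Server=.;Database=Shop;Integrated Security=true")) // hypothetical
            {
                conn.Open();
                var cmd = new SqlCommand(
                    "UPDATE Inventory SET Quantity = Quantity - 1 WHERE ProductId = @id", conn);
                cmd.Parameters.AddWithValue("@id", 42);
                cmd.ExecuteNonQuery();
                scope.Complete(); // commit
            }
        }

        static void Main()
        {
            // Serializable: strongest consistency, lowest concurrency (locks held until commit)
            UpdateInventory(IsolationLevel.Serializable);
            // ReadCommitted: more concurrency, weaker guarantees (e.g., non-repeatable reads are possible)
            UpdateInventory(IsolationLevel.ReadCommitted);
        }
    }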

    Furthermore, if we look at many of the Internet-scale applications, such as Amazon S3/SimpleDB, Google BigTable, and the open source Hadoop, their designs and approaches are very different from traditional on-premises RDBMS software. Their primary goal is to provide scalable and performant databases for extremely large data sets (lots of nodes and petabytes of data), which results in trading off some aspects of data integrity and requires users to accommodate data that is “eventually consistent”.

    Amazon Web Services CTO Werner Vogels has recently updated his thoughts on “eventual consistency” in highly distributed and massively scaled architectures. It is an excellent read for more detail on the fundamental principles behind this trade-off between the two models.

    Thus, on-premises and cloud-based architectures are optimized for different things. That means on-premises platforms are still relevant for specific purposes, just as cloud-based architectures are. We just need to understand the trade-offs so each can be used effectively for the right reasons.

    For example, an online retailer’s product catalog and storefront applications, which work with published/shareable data and need absolute availability, are prime candidates to be built as cloud applications. However, once a shopping cart goes into checkout, that process can be brought back into the on-premises architecture, integrated with systems that handle order processing and fulfillment, billing, inventory control, account management, etc., which demand data accuracy and integrity.

    The Microsoft Platform

    I hope it is becoming clear why Microsoft took this direction in building out the Azure Services Platform. For example, the underlying technologies used to implement Azure include Windows Server 2008, but Microsoft decided to call the compute capability Windows Azure, because it represents application containers that operate at a higher level in the stack, instead of Windows Server VM instances for us to use directly. In fact, it actually required more engineering effort this way, but the end result is a platform that provides extreme scalability and availability and makes the highly distributed and replicated processes and data transparent, while hiding the complexities of the systems automation and management operations on top of a network of globally distributed data centers. This should help clarify, at a high level, how Azure can be used to extend existing/legacy on-premises assets, instead of being just another outsourced managed hosting location.

    Of course, this is only what the initial version of the platform looks like. From a long-term perspective, Microsoft does plan to increase parity between the on-premises and cloud-based platform components, especially from a development and programming model perspective, so that applications can be more portable across the S+S spectrum. But the fundamental differences will still exist, which will help to articulate the distinct values provided by different parts of the platform.

    Thus the Azure Services Platform is intended for a “new class of applications”. Different from traditional on-premises database-driven applications, the new class of “cloud applications” is increasingly more “services-driven”, as applications operate in a service-oriented environment, where data can be managed and provisioned as services by cloud-based database service providers such as Amazon S3/SimpleDB, Google MapReduce/BigTable, Azure SQL Services, Windows Azure Storage Services, etc., and capabilities can be integrated from other services running on the Web, provisioned by various private and public clouds. These applications inherently operate at Internet scale, and are designed with a different set of fundamentals such as eventual consistency, idempotent processes, federated identity, services-based functional partitioning and composition (loose coupling), isolation, parallel and replicated data and process architecture, etc.
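
    To ground one of those fundamentals, here is a minimal, hypothetical C# sketch of an idempotent message handler (the message shape and the in-memory store are stand-ins for illustration, not part of any Azure API): recording processed message IDs means a redelivered message, which an at-least-once, eventually consistent environment should be expected to produce, does not get applied twice.

    using System;
    using System.Collections.Concurrent;

    // Hypothetical message shape; in practice this might arrive from a cloud queue.
    class OrderMessage
    {
        public Guid MessageId { get; set; }
        public string OrderId { get; set; }
    }

    class IdempotentOrderHandler
    {
        // Stand-in for a durable store (e.g., a cloud table); in-memory here purely for illustration.
        private readonly ConcurrentDictionary<Guid, bool> processed = new ConcurrentDictionary<Guid, bool>();

        public void Handle(OrderMessage message)
        {
            // TryAdd succeeds only the first time a given MessageId is seen,
            // so a redelivered message becomes a no-op instead of a duplicate side effect.
            if (!processed.TryAdd(message.MessageId, true))
                return;

            ApplyOrder(message.OrderId);
        }

        private void ApplyOrder(string orderId)
        {
            Console.WriteLine("Processing order " + orderId);
        }
    }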

    This post is part of a series of articles on cloud computing and related concepts.

  • Architecture + Strategy

    Run Java with Jetty in Windows Azure

    • 11 Comments
    Windows Azure

    [Update 2011.01.17]NIO is no longer an issue in Windows Azure with SDK 1.3 (see post for more details)

    [Update 2010.03.28] Included Jetty configuration information (see “Configure Jetty” section below)

    Jetty is a Java-based, open source Web server that provides an HTTP server and Servlet container capable of serving static and dynamic content, in either standalone or embedded instantiations. Jetty is used by many popular projects such as the Apache Geronimo JavaEE compliant application server, Apache ActiveMQ, Apache Cocoon, Apache Hadoop, Apache Maven, BEA WebLogic Event Server, Eucalyptus, FioranoMQ Java Messaging Server, Google App Engine and Web Toolkit plug-in for Eclipse, Google Android, RedHat JBoss, Sonic MQ, Spring Framework, Sybase EAServer, Zimbra Desktop, etc. (just to name a few).

    The Jetty project provides:

    • Asynchronous HTTP Server
    • Standards-based Servlet Container
    • Web Sockets server
    • Asynchronous HTTP Client
    • OSGi, JNDI, JMX, JASPI, AJP support

    From an application container perspective, Jetty can be used as an alternative deployment approach for the most popular Java frameworks, such as Spring (and many of its sub-projects), EJB containers, integration with JEE servers as mentioned above, and a lot more, most of which is supported via configuration as opposed to code-level integration.

    Java Support in Windows Azure

    Since PDC09 (Professional Developers Conference), we’ve announced support for running Java and Tomcat in Windows Azure, and highlighted a project at Domino’s Pizza that ran a version of their online pizza ordering website in Windows Azure. Below is a short list of current resources that provide information on Java support in Windows Azure:

    However, this doesn’t mean that Tomcat is the only Java application container supported in Windows Azure. In fact, the approach basically consists of rolling in your own JRE (Java runtime), and any Java package that can be instantiated via the command line (instead of needing to install into the O/S). This Java application can then be packaged into a Worker Role application, then deployed into Windows Azure.

    So why a Worker Role when we have a Web Role in Windows Azure? A Web Role essentially uses an IIS front-end, thus it supports ASP.NET applications, and any FastCGI extensions (Java is supported there too, but I’ll save that for another post). But a Worker Role gives us a bit more flexibility, as a Web Role may define a single HTTP endpoint and a single HTTPS endpoint for external clients, whereas a Worker Role may define any number of external endpoints using HTTP, HTTPS, or TCP. Each external endpoint defined for a role must listen on a unique port.

    Thus things like Tomcat (and Jetty), which want to do their own listening on the ports they define, are more suitable for Worker Roles in Windows Azure. And to do that, a bit of plumbing is required. The biggest challenge is actually hooking up the physical port that the Windows Azure fabric controller assigns to an instance of the Worker Role, even though logically the pool of Worker Roles is intended to receive external HTTP traffic on port 80 (or any port of your choosing). This is because, internally in the Windows Azure environment, multiple VMs can reside on a single physical box, and the load balancer may need to forward requests to dynamically provisioned instances residing in different locations, so the fabric controller uses a different set of internal ports for this internal portion of the communication. Again, for Web Roles we don’t need to be concerned with this, as the fabric controller automatically configures IIS to listen on the correct internal port.

    And this plumbing is exactly what the Tomcat Solution Accelerator provides (essentially making a call to the Worker Role environment and finding the right port, then making that change in server.xml for Catalina in Tomcat to pick up, then start the server to listen on that port), but if we want to do something else, like Jetty, then we need to do this plumbing ourselves. But don’t worry, it’s actually pretty simple, and really opens up the opportunity to deploy all kinds of stuff into Windows Azure.
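
    As a rough sketch of the kind of plumbing described above (this is not the actual Tomcat Solution Accelerator code; the endpoint name, XPath, and folder layout are assumptions for illustration), a Worker Role could look up its assigned port, rewrite the Connector port in server.xml, and then launch Tomcat:

    // Sketch only: rewrite Tomcat's server.xml with the port assigned by the fabric
    // controller, then launch Catalina. Endpoint name "HttpIn" and paths are assumptions.
    string port = RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["HttpIn"].IPEndpoint.Port.ToString();
    string tomcatHome = Environment.GetEnvironmentVariable("RoleRoot") + @"\approot\app\tomcat";
    string serverXml = tomcatHome + @"\conf\server.xml";

    var doc = new System.Xml.XmlDocument();
    doc.Load(serverXml);
    var connector = doc.SelectSingleNode("//Connector[@protocol='HTTP/1.1']") as System.Xml.XmlElement;
    if (connector != null)
    {
        connector.SetAttribute("port", port);   // listen on the internally assigned port
        doc.Save(serverXml);
    }

    // Launch catalina.bat through cmd.exe, passing the environment it expects
    var tomcat = new System.Diagnostics.Process();
    tomcat.StartInfo.FileName = "cmd.exe";
    tomcat.StartInfo.Arguments = String.Format("/c \"{0}\\bin\\catalina.bat\" run", tomcatHome);
    tomcat.StartInfo.UseShellExecute = false;   // required to set environment variables below
    tomcat.StartInfo.EnvironmentVariables["CATALINA_HOME"] = tomcatHome;
    tomcat.StartInfo.EnvironmentVariables["JRE_HOME"] = Environment.GetEnvironmentVariable("RoleRoot") + @"\approot\app\jre6";
    tomcat.Start();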

    Run Jetty in Windows Azure in <10 Minutes

    Below is a screenshot of my little Worker Role in Windows Azure running a Jetty Web server on port 80 (don’t try the URL; I didn’t leave the application running).

    [screenshot: Jetty running in a Windows Azure Worker Role]

    Worker Role Implementation

    To do this, basically just start with a new Cloud project in Visual Studio, and add one Worker Role (just the same as a “Hello World” Worker Role app). And below is the only code I added inside of the Run() method in the WorkerRole class (minus the tracing code which I removed from this view):

    string response = "";
    try
    {
        System.IO.StreamReader sr;
        // Port assigned by the fabric controller to this instance's "HttpIn" endpoint
        string port = RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["HttpIn"].IPEndpoint.Port.ToString();
        // Locate the JRE and Jetty assets deployed with the role
        string roleRoot = Environment.GetEnvironmentVariable("RoleRoot");
        string jettyHome = roleRoot + @"\approot\app\jetty7";
        string jreHome = roleRoot + @"\approot\app\jre6";
        // Launch Jetty as a sub-process, passing the assigned port and Jetty home via system properties
        Process proc = new Process();
        proc.StartInfo.UseShellExecute = false;
        proc.StartInfo.RedirectStandardOutput = true;
        proc.StartInfo.FileName = String.Format("\"{0}\\bin\\java.exe\"", jreHome);
        proc.StartInfo.Arguments = String.Format("-Djetty.port={0} -Djetty.home=\"{1}\" -jar \"{1}\\start.jar\"", port, jettyHome);
        proc.EnableRaisingEvents = false;
        proc.Start();
        // Block on the sub-process output so the role instance keeps running while Jetty is up
        sr = proc.StandardOutput;
        response = sr.ReadToEnd();
    }
    catch (Exception ex)
    {
        response = ex.Message;
        Trace.TraceError(response);
    }

    Essentially, all this does is start a new sub-process in the Worker Role instance to host and run the JRE and Jetty. The rest is figuring out how to provide the correct arguments to the process so that Jetty loads properly and listens on the port that this particular Worker Role instance has been assigned, as the forwarding port from the load balancer.

    And of course, this is a pretty simplified scenario (and also because Jetty is pretty easy to launch), though more complicated configuration can also be supported, such as setting environment variables, changing XML configuration files (like what we have to do with Tomcat’s server.xml), executing a .bat file or other scripts, starting up multiple processes, etc.
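
    For instance, here is a hedged sketch of the environment-variable case, reusing the jreHome, jettyHome, and port variables from the code above (the script name and variable names are assumptions; Jetty itself does not require them, this just shows the mechanics):

    // Sketch: run a startup script through cmd.exe and pass settings via environment variables.
    Process proc = new Process();
    proc.StartInfo.FileName = "cmd.exe";
    proc.StartInfo.Arguments = String.Format("/c \"{0}\\startup.bat\"", jettyHome);  // hypothetical script
    proc.StartInfo.UseShellExecute = false;                  // must be false to modify EnvironmentVariables
    proc.StartInfo.EnvironmentVariables["JAVA_HOME"] = jreHome;
    proc.StartInfo.EnvironmentVariables["JAVA_OPTIONS"] = "-Xmx512m -Djetty.port=" + port;
    proc.Start();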

    Add the Java and Jetty Assets

    Next, copy and paste the JRE and Jetty files into the Worker Role’s folder. I put them under another folder called “app”, in their individual folders “jre6” and “jetty7”. In Solution Explorer in Visual Studio, this looks like:

    [screenshot: Solution Explorer showing the app\jre6 and app\jetty7 folders under the Worker Role]

    Configure the External Input Endpoint

    And the last significant change is to ServiceDefinition.csdef, by adding the below to the “WorkerRole” element:

    <Endpoints>
      <InputEndpoint name="HttpIn" port="80" protocol="tcp" />
    </Endpoints>

    This is required to tell the fabric controller that the external input endpoint for this application is port 80.

    Configure Jetty (updated 2010.03.28)

    [Update 2011.01.17]NIO is no longer an issue in Windows Azure with SDK 1.3 (see post for more details). No need to replace NIO with BIO connector as described below.

    There are many ways Jetty can be configured to run in Azure (such as an embedded server, or starting from scratch instead of using the entire distribution with its demo apps), so earlier I didn’t include how my deployment was configured; each application can use a different configuration, and I didn’t think my approach was necessarily the most ideal solution. Anyway, here is how I configured Jetty to run for the Worker Role code shown in this post.

    First, I had to change the default NIO ChannelConnector that Jetty was using, from the new/non-blocking I/O connector to the traditional blocking IO and threading model BIO SocketConnector (because the loopback connection required by NIO doesn’t seem to work in Windows Azure).

    This can be done in etc/jetty.xml, where we set connectors, by editing the New element to use org.eclipse.jetty.server.bio.SocketConnector instead of the default org.eclipse.jetty.server.nio.SelectChannelConnector, and remove the few additional options for the NIO connector. The resulting block looks like this:

    <Call name="addConnector">
      <Arg>
        <New class="org.eclipse.jetty.server.bio.SocketConnector">
          <Set name="host"><SystemProperty name="jetty.host" /></Set>
          <Set name="port"><SystemProperty name="jetty.port" default="8080" /></Set>
          <Set name="maxIdleTime">300000</Set>
        </New>
      </Arg>
    </Call>

    Now, I chose the approach of using the JRE and Jetty files simply as application assets (read-only), mostly because this is a simpler exercise, and because I wanted to streamline the Worker Role implementation as much as possible, without doing more configuration on the .NET side than necessary (such as allocating a separate local resource/storage for Jetty, copying files over from the deployment, and using that as a runtime environment where Jetty can write to files).

    As a result of this approach, I needed to make sure that Jetty doesn’t need to write to any files locally when the server is started. Here is what I did:

    • etc/jetty.xml – commented out the default “RequestLog” handler so that Jetty doesn’t need to write to that log
    • etc/jetty.xml – changed the “extract” property of the “org.eclipse.jetty.deploy.WebAppDeployer” bean (added via addBean) to “false” so that it doesn’t extract the .war files
    • contexts/test.xml – changed <Set name="extractWAR">’s property to “false” so that it doesn’t extract the .war files as well

    The step above is optional, as you can also create a local storage resource and copy the jetty and jre directories over, then launch the JRE and Jetty from those files. But you’ll also need more code in the Worker Role to support that, as well as a different location from which to find the JRE and Jetty when launching the sub-process (the Tomcat Solution Accelerator used that approach).
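
    For reference, a minimal sketch of that alternative (the LocalStorage resource name "JettyStorage" would have to be declared in ServiceDefinition.csdef, and the folder layout is an assumption) could look like this:

    // Sketch: give Jetty a writable home by copying the read-only deployment assets
    // into a LocalStorage resource declared as e.g. <LocalStorage name="JettyStorage" sizeInMB="512" />.
    LocalResource storage = RoleEnvironment.GetLocalResource("JettyStorage");
    string source = Environment.GetEnvironmentVariable("RoleRoot") + @"\approot\app\jetty7";
    string target = System.IO.Path.Combine(storage.RootPath, "jetty7");

    System.IO.Directory.CreateDirectory(target);
    foreach (string dir in System.IO.Directory.GetDirectories(source, "*", System.IO.SearchOption.AllDirectories))
        System.IO.Directory.CreateDirectory(dir.Replace(source, target));
    foreach (string file in System.IO.Directory.GetFiles(source, "*", System.IO.SearchOption.AllDirectories))
        System.IO.File.Copy(file, file.Replace(source, target), true);

    // Jetty can then be launched with -Djetty.home pointing at the writable copy under 'target'.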

    Download the project

    The entire solution (2.3MB), including the Jetty assets as configured (but minus the ~80MB of JRE binaries), can be downloaded from SkyDrive.

    Summary

    The approach I described in this post uses the JRE and Jetty files simply as application assets. From a development process perspective, I was thinking that development and testing of the Java applications to be deployed in Jetty should be done outside of the Windows Azure environment, and Windows Azure can be viewed as the staging/production environment for the applications. From that perspective, we shouldn’t need hot deployment of applications into Jetty, and other things, such as log files written by Jetty and data/content used by Jetty and the applications, should be persisted to and retrieved from the Windows Azure Storage and/or SQL Azure services.

    The implementation is really just intended for deployment into the Windows Azure cloud environment (or the local dev fabric for testing purposes). For developing applications to run in Jetty, we can use any tools we prefer in the Java ecosystem – Eclipse, NetBeans, IntelliJ, Emacs, etc., and do all of the unit testing and packaging into supported forms (such as .war packages). That way, development efforts on the Java side can be just as productive, while the only work that we really need to do is a one-time integration and configuration of the staging/production runtime, if we choose to use Windows Azure as the cloud platform to host an application.

    Similarly, for developing and unit testing the C# managed code inside the Worker Role, we also don’t always have to work inside the Windows Azure environment. In fact, it may be quicker to write the same code as a console application and test it there first, then port it over to the Worker Role, since by that time we’d only need to be concerned with integrating with specific items inside the Windows Azure environment. And of course, we can refactor the code to be more parameter-driven so the Java integration code can be nicely decoupled from the Windows Azure integration code.
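
    As a small sketch of what I mean (the class and method names are made up), the Java launch logic can live in a plain class that knows nothing about Windows Azure, so it can be exercised from a console application first:

    // Requires: using System; using System.Diagnostics;
    public class JettyLauncher
    {
        public Process Start(string jreHome, string jettyHome, int port)
        {
            var proc = new Process();
            proc.StartInfo.UseShellExecute = false;
            proc.StartInfo.FileName = String.Format("\"{0}\\bin\\java.exe\"", jreHome);
            proc.StartInfo.Arguments = String.Format(
                "-Djetty.port={0} -Djetty.home=\"{1}\" -jar \"{1}\\start.jar\"", port, jettyHome);
            proc.Start();
            return proc;
        }
    }

    // Console test harness (outside Windows Azure):
    //   new JettyLauncher().Start(@"C:\java\jre6", @"C:\java\jetty7", 8080);
    // Worker Role (only the Azure-specific lookup lives here):
    //   int port = RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["HttpIn"].IPEndpoint.Port;
    //   new JettyLauncher().Start(jreHome, jettyHome, port);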

    Anyway, simple exercise done. Now we can try some more complicated things such as Geronimo, GlassFish, Hadoop, etc. Stay tuned for upcoming posts about these efforts.

    Lastly, these are the tools I used:

  • Architecture + Strategy

    Describing Cloud Computing

    • 8 Comments

    Cloud computing, the buzzword du jour and hottest cliché in IT at the moment, is a source of extensive debate as well as general confusion. Just like other buzzwords before it (SOA, Web 2.0, Software-as-a-Service (SaaS)), "cloud computing" is a very general term, and there are many interpretations of what it means.

    And just like those buzzwords, cloud computing represents the convergence of many other megatrends and evolutionary developments leading up to this point. For example, SOA, Web as a platform, SaaS, utility computing, grid architectures, on-demand computing, outsourced hosting, managed/application service providers, etc., can all be considered contributing components to cloud computing.

    This slide from Gartner has a nice list of some common misunderstandings of cloud computing.

    So what is cloud computing then? Taking a similar approach to last year's post on the Web as a platform, a formal definition of cloud computing won't be attempted here. Instead, we will try to describe the emerging phenomenon by looking at its components, contributing aspects and patterns, as well as some of the current issues and challenges.

    Fundamental Aspects of Cloud Computing

    Again, this is just one way of looking at cloud computing, among many others. Basically, here are a few fundamental aspects I think are relevant:

    • Service Delivery
    • Service Composition
    • Service Consumption
    • Economies of Scale
    • Network Effect
    • Ubiquity / Pervasiveness

    Service Delivery

    The cloud has become an alternative means of delivering various types of capabilities as services. Software-as-a-service (SaaS) is the one that receives a lot of attention, but from a cloud computing perspective it can be any type of technology-enabled capability delivered as a service. For example, virtualized storage, communication, and compute capacity (utility computing) are all existing forms of capabilities delivered as services, but they are not categorically defined as software capabilities.

    In the SaaS space, Salesforce.com has been cited as one of the prime examples, as a complete end-to-end solution delivered as a service. Earlier examples include outsourced hosting providers, managed service providers, etc. Today the cloud-based delivery model is being applied across all types of scenarios, including delivering traditional desktop applications as cloud-based services, such as Google Apps and Zoho Office. Microsoft of course has its own set of cloud-based services as well.

    Service Composition

    Not all services are of the same type. In fact, many types exist, as well as many ways to categorize/describe them. They range from infrastructure, platforms, and building blocks that operate at a technical level, to finished services that users can interact with directly.

    The fundamental aspect is that individual services can sometimes be integrated, composed, or aggregated into higher-level services that perform a specific function available in the cloud. This is essentially the extension of SOA principles into the cloud, and it enables the Web as a platform.

    Service Consumption

    Web browsers will continue to serve as the primary means of accessing cloud-based services. But the cloud does not dictate that browsers be the only means of doing so. In fact, cloud-based services provide the ideal environment for enabling users to consume services from all kinds of interfaces and devices.

    We are already seeing many vendors pushing the envelope beyond HTML-based browser interfaces. For example, on-line in-browser RIA platforms such as Adobe Flash, Curl, JavaFX, and Microsoft Silverlight, etc.; off-line desktop RIA platforms such as Adobe AIR, Google Gears, Yahoo Konfabulator, and Microsoft Live Mesh; and increasingly, mobile device platforms such as Apple iPhone, Google Android, Java ME, Adobe Flash Lite, Yahoo GO!, Microsoft Silverlight, etc.

    The fundamental aspect is that cloud computing enables a seamless and consistent service consumption experience regardless of the type of device or interface used at the edge. In addition, differentiated and specialized user experiences can be delivered to different types of devices and form factors.

    Economies of Scale

    The most effective designs for cloud-based services use multi-tenant architectures and standardized operational and management processes. This allows service providers to share costs across multiple users, which lowers the overall cost of the service. And as the scale of usage/adoption increases, so do the economies of scale. This is not limited to managing costs; monetization models behave similarly, as in the case of Google's search advertising business.

    The cloud provides the ideal environment for service providers to leverage massively-scaled capabilities to achieve the intended economies. But this does not mean all service providers have to plan for similar scale. For example, Salesforce.com's scale in terms of managing its user base is significantly different from the type of scale that Google or Amazon has to deal with.

    Network Effect

    The cloud is more than a huge collection of isolated services and islands of services. Its true power is unlocked when individual services become more inter-connected and inter-dependent. With increasing levels of interoperability, composability, data portability, semantic consistency, etc., the cloud moves away from disconnected services that require specialized integration efforts, to a dynamic system that is much more interactive and intelligent.

    Ubiquity / Pervasiveness

    The word cloud implies a level of ethereal ubiquity and pervasiveness. Cloud-based services should be provisioned such that users do not have to be concerned with how to access them, or with differences between platforms. In addition, details of the technical implementations, scaling options, etc. should be completely transparent to users as well.

    Users should be able to interact with cloud-based services in a consistent context, even when navigating between the public cloud and private clouds.

    Challenges and Issues

    Undoubtedly, cloud computing presents many disruptive forces to existing models of using on-premises and locally installed software and applications. But some questions are still relevant. For example:

    • Who owns sensitive and confidential data?
    • How is SLA managed and enforced/guaranteed?
    • Regulatory compliance?
    • More potential points of failure?
    • Typical operational concerns, such as fine-grained control over security/privacy, governance, support for troubleshooting, overall performance, disaster recovery, planned downtimes, change management, etc.
    • Does it really cost less to users?
    • How will I differentiate?

    Another Gartner slide has a good list of common concerns.

    For example, some recently reported issues underscore the difficulty in maintaining massively-scaled cloud services (of course Microsoft has had its share of outages too):

    And a recent BusinessWeek article "On-Demand Computing: A Brutal Slog" pointed out:

    "Funny thing about the Web, though. It's just as good at displacing revenue as it is in generating sources of it. Just ask the music industry or, ahem, print media. Think Robin Hood, taking riches from the elite and distributing them to everyone else, including the customers who get to keep more of their money and the upstarts that can more easily build competing alternatives."

    And Nicholas Carr's post back in 2005 on "The amorality of Web 2.0" succinctly points out the changes in monetization models in the shift towards the Web.

    While we can expect that the industry as a whole will continue to move into the cloud, and gain more maturity in deploying cloud-based services, these fundamental questions and challenges will have to be addressed as well. Although, some issues inherent in highly distributed architectures will persist and need to be evaluated as trade-offs.

    Forward Looking Perspective

    It seems we are in land-grab mode today, with organizations, including Microsoft, rushing towards the cloud. Next we should expect to see a similar hype-boom-bust cycle play out, where many attempts fade into non-existence, major consolidations occur in the market, and finally a handful of strong players remain standing, owning a major portion of the collective cloud.

    However, when the dust (or the pendulum) finally settles, will we see cloud computing supplanting the "legacy" models? Most likely not, though cloud computing will become one of the viable options for organizations to choose from. And we can expect that there will not be a one-size-fits-all approach or a single "right" way to decide between on-premises and cloud-based services. Just as with previous megatrends such as SOA and Web 2.0, each organization may end up with a different mix of options and degrees of adoption.

    Technically, we can anticipate advances towards context-aware computing as a convergence of semantic Web, data portability, derived collective intelligence, intent-driven systems, etc.; which should result in a much more intelligent and interactive cloud.

    This post is part of a series of articles on cloud computing and related concepts.

     

  • Architecture + Strategy

    Run Java with GlassFish in Windows Azure

    • 6 Comments

    At PDC10 (Microsoft’s Professional Developers Conference 2010), Microsoft again affirmed its support for Java in Windows Azure. “We're making Java a first-class citizen with Windows Azure, the choice of the underlying framework, the choice of the development tool,” said Bob Muglia (President of Server and Tools at Microsoft) during his keynote presentation (transcript). Then during PDC, Vijay Rajagopalan delivered a session (Open in the Cloud: Windows Azure and Java) which provided more details on the state of many deliverables, including:

    During his presentation, Vijay also talked about a successful deployment of Fujitsu’s Interstage application server (a Java EE 6 app server based on GlassFish) in Windows Azure, plus a whole slew of base platform improvements announced via the Windows Azure team blog, which helped address many of the limitations we observed last year, such as not being able to use NIO as described in my earlier work with Jetty and Windows Azure.

    Lots of great news, and I was finally able to sit down and try some things hands-on with the latest release of the Windows Azure SDK (version 1.3; November 2010), which included many of the announced improvements.

    Java NIO now works!

    First off, one major limitation identified previously was that the networking sandbox model in Windows Azure (in place for security reasons) blocked the loopback adapter that NIO needed. This was discussed at PDC, and the fact that the Fujitsu Interstage app server worked (it is based on GlassFish, which uses NIO) proved that this is no longer an issue. And fortunately, there isn’t anything additional we need to do to “enable” NIO; it just works now. I tried my simple Jetty Azure project by changing it back to using org.eclipse.jetty.server.nio.SelectChannelConnector, deployed it into Windows Azure, and it ran!

    Also worth noting is that the startup time in Windows Azure has improved significantly. My Jetty Azure project took just a few minutes to become accessible on the Internet (it had taken more than 20 minutes at one point in time).

    Mario Kosmiskas also posted an excellent article which showed Jetty and NIO working in Windows Azure (and many great tips which I leveraged for the work below).

    Deploying GlassFish

    Since Fujitsu Interstage (based on GlassFish) already works in Azure, GlassFish itself should work as well. So I thought I’d give it a try and learn from the exercise. First I tried to build on the Jetty work, but started running into issues with needing Visual Studio and the Azure tools to copy/move/package large numbers of files and nested folders when everything is placed inside the project structure – GlassFish itself has 5000+ files and 1100+ folders (the resulting super-long file path names for a few files caused the issue). This became a good reason to try the approach of loading application assets into role instances from the Blob storage service, instead of packaging everything together inside the same Visual Studio project (as is the best practice for ASP.NET Web Role projects).

    This technique was inspired by Steve Marx’s article from last year (loading role instance assets from the Blob service) and realized using the work from Mario Kosmiskas’ article (PowerShell magic). With it, I was able to have an instance of GlassFish Server Open Source Edition 3.1 (build 37; the latest at the time of this writing) deployed and running in Windows Azure in a matter of minutes. Below is a screenshot of the GlassFish Administration Console running on a cloudapp.net subdomain (in Azure).

    [screenshot: GlassFish Administration Console running on a cloudapp.net subdomain]

    To do this, basically just follow the detailed steps in Mario Kosmiskas’ article. I will highlight the differences here, plus a few bits that weren’t mentioned in the article. The easiest way is to just reuse his Visual Studio project (MinimalJavaWorkerRole.zip); I started from scratch so that I could use GlassFish-specific names throughout.

    1. Create a new Cloud project, and add a Worker Role (I named mine GlassFishService). The project will open with a base project structure.

    2. Copy and paste in the files (from Mario’s project, under the ‘JettyWorkerRole’ folder, and pay attention to his comments on Visual Studio’s default UTF-8 encoding):

    • lib\ICSharpCode.SharpZipLib.dll
    • Launch.ps1
    • Run.cmd

    Paste them into the Worker Role project. For me, the resulting view in Solution Explorer is below:

    [screenshot: Solution Explorer showing Launch.ps1, Run.cmd, and lib\ICSharpCode.SharpZipLib.dll under the Worker Role]

    3. Open ServiceDefinition.csdef, and add the Startup and Endpoints information. The resulting file should look something like this:

    <?xml version="1.0" encoding="utf-8"?>
    <ServiceDefinition name="GlassFishAzure" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
      <WorkerRole name="GlassFishService">
        <Imports>
          <Import moduleName="Diagnostics" />
        </Imports>
        <Startup>
          <Task commandLine="Run.cmd" executionContext="limited" taskType="background" />
        </Startup>
        <Endpoints>
          <InputEndpoint name="Http_Listener_1" protocol="tcp" port="8080" localPort="8080" />
          <InputEndpoint name="Http_Listener_2" protocol="tcp" port="8181" localPort="8181" />
          <InputEndpoint name="Http_Listener_3" protocol="tcp" port="4848" localPort="4848" />
          <InputEndpoint name="JMX_Connector_Port" protocol="tcp" port="8686" localPort="8686" />
          <InputEndpoint name="Remote_Debug_Port" protocol="tcp" port="9009" localPort="9009" />
        </Endpoints>
      </WorkerRole>
    </ServiceDefinition>

    As you can see, the Startup element instructs Windows Azure to execute Run.cmd as a startup task. And the InputEndpoint elements are used to specify the ports that the GlassFish server needs to listen on for external connections.

    4. Open Launch.ps1 and make a few edits. I kept the existing functions unchanged. The parts that changed are shown below:

    $connection_string = 'DefaultEndpointsProtocol=http;AccountName=dachou1;AccountKey=<your acct key>'
    
    # JRE
    $jre = 'jre-1.6.0_23.zip'
    download_from_storage 'java' $jre $connection_string (Get-Location).Path
    unzip ((Get-Location).Path + "\" + $jre) (Get-Location).Path
    
    # GlassFish
    $glassfish = 'glassfish-3.1-b37.zip'
    download_from_storage 'apps' $glassfish $connection_string (Get-Location).Path
    unzip ((Get-Location).Path + "\" + $glassfish) (Get-Location).Path
    
    # Launch Java and GlassFish
    .\jre\bin\java `-jar .\glassfish3\glassfish\modules\admin-cli.jar start-domain --verbose

    Essentially, update the script with:

    • Account name and access key from your Windows Azure Storage Blob service (where you upload the JRE and GlassFish zip files into)
    • Actual file names you’re using
    • The appropriate command that references eventual file locations to launch the application (or call another script)

    The script above extracts both zip files into the same location (at the time of this writing, in Windows Azure, this is the local storage at E:\approot). So your commands need to reflect the appropriate file and directory structure based on how you put together the zip files.

    5. Upload the zip files into the Windows Azure Storage Blob service. I followed the same convention of placing them into ‘apps’ and ‘java’ containers. For GlassFish, I used the glassfish-3.1-b37.zip downloaded directly from glassfish.java.net. For the Java Runtime (JRE), I zipped the ‘jre’ directory under the SDK I installed. Many existing tools can help you do the upload; I used Neudesic’s Azure Storage Explorer. As a result, Server Explorer in Visual Studio showed this view of Windows Azure Storage (and the containers would show the uploaded files):

    [screenshot: Server Explorer showing the ‘apps’ and ‘java’ Blob containers in Windows Azure Storage]

    That is it! Now we can upload this project into Windows Azure and have it deployed. Just publish it and let Visual Studio and Windows Azure do the rest:

    [screenshot: publishing the deployment from Visual Studio]

    Once completed, you could go to port 8080 on the Website URL to access GlassFish, which should show something like this:

    [screenshot: GlassFish server welcome page]

    If you’d like to use the default port 80 for HTTP, just go back to ServiceDefinition.csdef and update the one line to:

    <InputEndpoint name="Http_Listener_1" protocol="tcp" port="80" localPort="8080" />

    GlassFish itself will still listen on port 8080, but externally the Windows Azure cloud environment will receive user requests on port 80, and route them to the VM GlassFish is running in via port 8080.

    Granted, this is a very simplified scenario, and it only demonstrates that the GlassFish server can run in Windows Azure. There is still a lot of work needed to enable the functionality Java applications expect from a server, such as systems administration and management components, integration points with other systems, logging, relational database and resource pooling, inter-node communication, etc.

    In addition, some environmental differences, such as Windows Azure being a stateless environment with a non-persistent local file system, also need to be mitigated. These differences make Windows Azure a little different from existing Windows Server environments, and thus there are different things we can do with Windows Azure beyond using it simply as an outsourced hosting environment. My earlier post based on the JavaOne presentation goes into a bit more detail on how highly scalable applications can be architected differently in Windows Azure.

    Java deployment options for Windows Azure

    With Windows Azure SDK 1.3, we now have a few approaches we can pursue to deploy Java applications in Windows Azure. A high-level overview based on my interpretations:

    Worker Role using Visual Studio deployment package – This is essentially the approach outlined in my earlier work with Jetty and Windows Azure. With this approach, all of the files and assets (JRE, Jetty distributable, applications, etc.) are included in the Visual Studio project, then uploaded into Windows Azure in a single deployment package. To kick-off the Java process we can write bootstrapping code in the Worker Role’s entry point class on the C# side, launch scripts, or leverage SDK 1.3’s new Startup Tasks feature to launch scripts and/or executables.

    Worker Role using Windows Azure Storage Blob service – This is the approach outlined in Mario Kosmiskas’ article. With this approach, all of the files and assets (JRE, Jetty distributable, applications, etc.) are uploaded and managed in the Windows Azure Storage Blob service separately, independent of the Visual Studio project that defines the roles and services configurations. To kick-off the Java process we could again leverage SDK 1.3’s new Startup Tasks feature to launch scripts and/or executables. Technically we can invoke the script from the role entry class in C# too, which is what the initial version of the Tomcat Accelerator does, but Java teams may prefer an existing hook to call the script.

    Windows Azure VM Role – The new VM Role feature could be leveraged, so that we can build a Windows Server-based image with all of the application files and assets installed and configured, and upload that image into Windows Azure for deployment. But note that while this may be perceived as the approach that most closely aligns to how Java applications are deployed today (by directly working with the server OS and file system), in Windows Azure this means trading off the benefits of automated OS management and other cloud fabric features.

    And with the new Remote Desktop feature in Windows Azure, we could probably install and configure Java application assets manually. But doing so essentially treats Windows Azure as a simple hosting facility (which it isn’t) and defeats the purpose of all the fault-tolerance and automated provisioning and management capabilities in Windows Azure. In addition, for larger deployments (many VMs) it would become increasingly tedious and error-prone if each VM needed to be set up manually.

    In my opinion, the Worker Role using Windows Azure Storage Blob service approach is ideally suited for Java applications, for a few reasons:

    • All of the development and application testing work can still be accomplished in the tools you’re already using. Visual Studio and the Windows Azure SDK and tools are only needed from a deployment perspective – deploying the script that launches the Java process
    • Individual zip files (or other archive formats) can be managed at any granularity level, instead of needing to put everything into the same deployment package when using Visual Studio for everything
    • Loading application assets from the Windows Azure Storage Blob service also provides more flexibility for versioning, reuse (of components), and managing smaller units of changes. We can have a whole set of scripts that load different combinations of assets depending on version and/or intended functionality

    Perhaps at some point we could just upload scripts into the Blob service, and the only remaining step would be to tell the Windows Azure management portal which scripts to run, as Startup Tasks, for which roles and instances, using the assets we store in the Blob service. But there are still things we could do with Visual Studio – setting up IntelliTrace, facilitating certificates for Remote Desktop, etc. However, treating the Blob service as a distributed storage system, automating the transfer of specific components and assets from storage to each role instance, and launching the Java process to run the application, could be a very interesting way for Java teams to run applications on the Windows Azure platform.

    Lastly, this approach is not limited to Java applications. In fact, any software component that can be loaded from Blob storage and then installed in an automated manner driven by scripts (we may even be able to use another layer of script interpreters such as Cygwin) can use this approach for deployment in Windows Azure. By managing libraries, files, and assets in Blob storage, and maintaining a set of scripts, a team can operate and manage multiple (and multiple versions of) applications in Windows Azure, with comparatively higher flexibility and agility than managing individual virtual machine images (especially for smaller units of change).

  • Architecture + Strategy

    .NET and Multiple Inheritance

    • 6 Comments

    Occasionally I get questions on why .NET does not support multiple inheritance. It is actually a pretty interesting question to contemplate, though I usually start the conversation by asking: "what issue requires multiple inheritance to solve?".

    More often than not, the question surfaces when people are trying to "do the right thing" by correctly refactoring code in an object-oriented manner and facilitating code reuse through inheritance, but encounter challenges when trying to reuse methods and code behaviors defined in separate places of the class hierarchy. Thus the most "natural" question becomes: what if I could just inherit the code from these classes...

    Many decisions in language design, just like software projects, are balancing acts between various trade-offs. There are many very interesting conversations happening in the community, such as the debates on generics and closures on the Java side (for example: How Does Language Impact Framework Design? and Will Closures Make Java Less Verbose? and James Gosling's Closures). It is interesting to see how much thought goes into each seemingly small decision on whether to add a specific language feature.

    There were many factors that influenced the .NET team to favor not implementing multiple inheritance. A few of the more prominent ones include:

    • .NET was designed to support multiple languages, but not all languages can effectively support multiple inheritance. Or technically they could, but the complexities added to their language semantics would make some of those languages more difficult to use (and, for languages like VB, less similar to their roots and harder to keep backward compatible), and not worth the trade-off of being able to reuse code in the manner of multiple inheritance
    • It would also make cross-language library interoperability (via CLS compliance) less of a reality than it is today, which is one of the most compelling features of .NET. There are over 50 languages supported on .NET in over 70 implementations today
    • The most visible factor is language semantics complexity. In C++ we needed explicit language features to address the ambiguities caused by multiple inheritance (such as the classic diamond problem), for example the "virtual" keyword to support virtual inheritance and help the compiler resolve inheritance paths (and we had to use it correctly, too)
    • As we know, code is written 20% of the time but read 80% of the time. Thus advocates on the simplicity side prefer not to add language features, for the sake of keeping semantics simple. In comparison, C# code is significantly simpler to read than C++ code, and arguably easier to write

    Java doesn't support multiple inheritance either, though probably for a different set of reasons. Thus it is not a case of simple oversight in design or lack of maturity; it was a careful and deliberate decision not to support multiple inheritance in the .NET and Java platforms.

    So what's the solution? Often people are directed to using interfaces, but interfaces by themselves don't lend themselves very well to reusing code and implementing separation of concerns, as interfaces are really intended to support polymorphism and loosely-coupled/contractual design. But beyond trying to tie behaviors into object inheritance hierarchies, there are many alternative approaches that can be evaluated to meet those requirements; for example, adopting relevant design patterns like Visitor, frameworks like MVC, delegates, mixins (interfaces combined with AOP), etc.
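
    As a small illustrative sketch (the types here are made up, not from any framework), composition plus interfaces achieves the reuse that usually motivates the multiple-inheritance question: the behavior lives in one reusable class, and otherwise-unrelated classes expose it by delegation while still satisfying the contract polymorphically.

    using System;

    // The reusable behavior lives in one place...
    public interface IAuditable
    {
        void Audit(string action);
    }

    public class AuditLogger : IAuditable
    {
        public void Audit(string action)
        {
            Console.WriteLine("{0:u} {1}", DateTime.UtcNow, action);
        }
    }

    // ...and classes in unrelated inheritance hierarchies reuse it by delegation,
    // while still being usable polymorphically through the IAuditable contract.
    public class Order : IAuditable
    {
        private readonly IAuditable auditor = new AuditLogger();
        public void Submit() { auditor.Audit("Order submitted"); }
        public void Audit(string action) { auditor.Audit(action); }
    }

    public class Customer : IAuditable
    {
        private readonly IAuditable auditor = new AuditLogger();
        public void Audit(string action) { auditor.Audit(action); }
    }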

    The bottom line is that there are considerably more elegant alternatives to inheriting/deriving behavior in class hierarchies when trying to facilitate code reuse through proper refactoring. Plus, the trade-off between code reuse and the costs incurred to manage that reuse is another full topic in itself. In some cases it might have been simpler to have multiple inheritance and access to sealed classes, but the trade-off might have been greater costs in other areas.
