Azure, AWS, Cloud ... exciting stuff.  These are all modern incarnations of hosting environments, and developers find them attractive because of the resume-polishing impact of a successful service or application.

When done right, this kind of app lowers costs to the business by reducing run-costs and equipment costs, and by making the decision on when to add more capacity simple.  But getting there is not easy.  Keeping these apps running is the bigger challenge - so visibility into what is going well and what isn't is a critical factor in building an app that can be managed remotely.  Thus, I coin a term - the managed application.

A managed application is different from a running application in that it is designed to be run with a very low cost profile that goes beyond the deployment and hardware savings associated with the new forms of hosting environments.  Instead, a managed application reduces the largest run-cost - the cost associated with downtime.  When a service that is making money or saving money is down, the costs (opportunity cost, reputation, SLA-driven reimbursements, career success) start to add up quickly.  Every second spent diagnosing a problem is a second in which these costs grow.

Based on access to operational data for some of the most popular services on the planet, let me share some fuzzed details (fuzzed to protect the specifics).  In general, the measure I am interested in is MTTR - the mean time to restore when an incident is noted.  Let's define an incident as "something a paying customer notices that results in dissatisfaction with the decision to use the service in the first place".  The incident clock starts ticking when the customer notices.  It stops when the customer can no longer notice.  Whether the underlying problem is fixed or not is not important - what matters is that the impact is no longer noticeable.  Thus, the acronym breaks down to "Mean Time To Restore" - restore service.
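To make the definition concrete, here is a minimal sketch of the arithmetic, assuming each incident is recorded as two timestamps: when the customer first notices and when the customer can no longer notice.  The `Incident` class, the function name and the sample timestamps are all made up for illustration; they are not real operational data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Incident:
    """Hypothetical incident record; field names are assumptions."""
    noticed_at: datetime   # the customer first notices the problem
    restored_at: datetime  # the customer can no longer notice it

    @property
    def time_to_restore(self) -> timedelta:
        return self.restored_at - self.noticed_at


def mean_time_to_restore(incidents: list[Incident]) -> timedelta:
    """Average the customer-visible duration over all incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((i.time_to_restore for i in incidents), timedelta())
    return total / len(incidents)


# Made-up sample: one 30-minute incident and one 90-minute incident.
incidents = [
    Incident(datetime(2010, 1, 4, 9, 0), datetime(2010, 1, 4, 9, 30)),
    Incident(datetime(2010, 1, 9, 14, 0), datetime(2010, 1, 9, 15, 30)),
]
mttr = mean_time_to_restore(incidents)
print(mttr)  # 1:00:00
```

Note that nothing in the calculation asks whether the root cause was fixed; only the customer-visible window counts.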

There are some lossy moments in calculating this.  Not being able to see that the customer is noticing a problem is a challenge.  If you don't know the service is impaired, you are wasting valuable time and the costs are mounting.  Thus the time between when the incident starts and when the team responsible for keeping costs low (hello operations!) finds out about it is important - think of this as the monitoring latency that I talked about in my post last week.
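One way to see how much monitoring latency contributes is to split each incident into the part where nobody on the team knew and the part where someone was working on it.  The sketch below assumes a third timestamp, the moment operations became aware, and assumes detection never precedes the customer noticing; the function name and the sample values are hypothetical.

```python
from datetime import datetime, timedelta


def split_restore_time(
    noticed_at: datetime, detected_at: datetime, restored_at: datetime
) -> tuple[timedelta, timedelta]:
    """Return (monitoring latency, time spent restoring after detection)."""
    if not noticed_at <= detected_at <= restored_at:
        raise ValueError("timestamps must be in chronological order")
    return detected_at - noticed_at, restored_at - detected_at


# Made-up incident: customer notices at 9:00, operations finds out
# at 9:20, and service is restored at 9:30.
blind, working = split_restore_time(
    datetime(2010, 1, 4, 9, 0),
    datetime(2010, 1, 4, 9, 20),
    datetime(2010, 1, 4, 9, 30),
)
# Dividing one timedelta by another yields a plain float.
blind_share = blind / (blind + working)
print(f"{blind_share:.0%} of the incident passed before anyone knew")
```

In this invented example, two thirds of the customer-visible time is spent before the team is even aware, which is time no amount of fast diagnosis can win back.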

So what do we do about it?  How do we deal, as developers, with making sure these important elements are a part of our design?  Read on - this week I introduce the "managed application" as an experience that development is ultimately in control of, and I would argue, should be held accountable for.

What is a managed application?

Developers who set out to create cloud hosted applications need to invest time and resources in building their applications so that they can be automatically managed by the cloud operating environment (“fabric” controller) and remotely managed where necessary.  Hosted applications run in constrained environments that optimize traditional hardware and facility costs.  In these environments, such as Windows Azure, the constraints come in the form of limits on what the development and operations teams in a business can directly configure.  To live within these constrained environments, new development practices must be employed.  Together, these practices allow developers to create a new class of applications that are designed to be remotely managed.  I am calling this class of applications “Managed Applications”[1].

The three tenets that distinguish a managed application from traditional applications are:

  • Deployment is fully automated: The constraints in managed environments prevent traditional installer-based deployment steps from being successful.  To run in modern hosting environments, applications must support fully automated deployment based on file sets or full server images. This requires an explicit separation of deployment and configuration, such that application configuration can be de-coupled from application binaries and maintained independently. Today, deployment and configuration are typically combined in the commonly used installer-based approach.
  • Remote monitoring is fully supported:  Because the apps are hosted in a third-party environment, one cannot manually intervene in the configuration of a new server.
    Hosted applications are more difficult to monitor because the physical control and location of a running application is opaque.  For this reason, managed applications need to support simple monitoring of capacity, resource consumption, performance and availability by the final consumer. In addition, managed application teams may want to see the aggregate results of what their end customers can see via monitoring, and to remotely diagnose edge issues that are relevant to the development team.
  • Runtime upgrade support:  A managed application that runs in a hosted environment exhibits excellent uptime.  Consumers of hosted applications expect 24x7 availability. Thus, a managed application supports break-fix deployment and version upgrade without interruption to running customer transactions. This requires a level of autonomy and loose coupling [2] to ensure that all or part of the application can be shut down and started up without causing wider concerns due to interdependencies.
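As a rough illustration of the second and third tenets, the sketch below shows one possible shape for a component that reports its own state for remote monitoring and can be drained before a break-fix or upgrade.  The `ManagedWorker` class and its methods are hypothetical and not part of any hosting platform's API; a real service would expose `health()` through whatever monitoring endpoint its environment provides.

```python
import threading


class ManagedWorker:
    """Hypothetical component that tracks in-flight customer transactions."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._idle = threading.Condition(self._lock)
        self._accepting = True
        self._in_flight = 0
        self._completed = 0

    def begin(self) -> bool:
        """Try to start a transaction; refused once draining has begun."""
        with self._lock:
            if not self._accepting:
                return False
            self._in_flight += 1
            return True

    def end(self) -> None:
        """Mark a transaction finished and wake any waiting drain()."""
        with self._idle:
            self._in_flight -= 1
            self._completed += 1
            if self._in_flight == 0:
                self._idle.notify_all()

    def health(self) -> dict:
        """Snapshot suitable for a remote monitoring probe."""
        with self._lock:
            return {
                "accepting": self._accepting,
                "in_flight": self._in_flight,
                "completed": self._completed,
            }

    def drain(self, timeout: float) -> bool:
        """Stop taking new work, then wait for running work to finish.

        Returns True if the worker became idle within the timeout.
        """
        with self._idle:
            self._accepting = False
            return self._idle.wait_for(
                lambda: self._in_flight == 0, timeout=timeout
            )


worker = ManagedWorker()
started = worker.begin()                     # one transaction in flight
finisher = threading.Timer(0.05, worker.end)  # it completes shortly after
finisher.start()
drained = worker.drain(timeout=2.0)          # waits for that transaction
finisher.join()
print(drained, worker.health())
```

The design choice worth noting is that draining refuses new work but never interrupts work already accepted, which is what lets one part of the application be taken down while running customer transactions complete.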

These three tenets roughly align to the well-known application life cycle.  An application goes through a development phase where it is designed, coded and tested.  Then it goes through a deployment and readiness phase.  The final phase is the production phase.  It is in the production phase that a managed application exhibits substantially better cost performance than traditional application designs.

Want to read more on this topic?  Read the full whitepaper here:

[1] While the focus of the post is on the development of cloud-ready “managed” applications, a lot of the tenets are applicable for modern, on-premise deployments as well.

[2] See Application patterns for green IT for some ideas on this topic.