As I mentioned previously, the study of clouds (cloud computing, of course) is becoming a very popular topic in the software industry. In the last couple of weeks alone, I read dozens of articles on the subject. Many of them propose various types of taxonomies for cloud computing, utility computing, PaaS, etc. Even more offer futuristic predictions, including but not limited to the doom of "on premise" software. But I found extremely few that attempt to explain the architectural impact of "the cloud".
With this gaping void in mind (by the way, similar to the void that existed about 2.5 years ago around the architectural impact of SaaS), I decided to spend some cycles trying to understand the implications of cloud computing for large enterprises and ISVs. To get started on the enterprise angle, I used a simple yet powerful technique: I asked. I asked various 'office of the CIO' type folks, whom I happen to meet quite often in my job, and tried to extract the commonalities in what they were telling me. I then bounced some ideas around with trusted colleagues and, once they were refined, validated these ideas with another group of 'office of the CIO' types, making sure I was not completely off the mark. Below are my findings.
'do it yourself' vs. 'as a service'
The first finding (which happens to be quite obvious after the fact) is that the most important thing an IT architect has to understand with regard to the cloud is the impact of the fundamental question that the business or IT will ask itself: “what will I do myself” versus “what will I get 'as a service'”.
'do it yourself' will give you control. But if you do it yourself you will not be able to tap into economy of scale; quite understandably, if you do it yourself, the scale is 1 (you), so there is not much economy there. You bear the full cost.
On the other hand, if you get something “as a service” you can tap into higher economy of scale. Because the “as a service” provider is hopefully serving thousands, if not hundreds of thousands, of customers, you benefit from the economy of scale that the provider is capable of achieving. But you have little control over what you get.
So key takeaway #1: as illustrated in the picture below, at the highest level, you are trading control for economy of scale (and vice versa).
who builds it and where does it run
The second element that is important to understand is that in a cloud-aware world, there are 2 dimensions of “do it yourself” vs. “as a service”:
First dimension: Who builds it? (the good old build vs. buy); this directly impacts the control of FEATURES. If you build the software, you control the features that will be in the software. If you get the software from a service provider, you get the features that are offered by the provider (very logical, isn't it?).
Second dimension: Where does it run? This choice impacts the control of SLA. If you run your stuff yourself, 'on premise', you have full control of the SLA. Note that controlling the SLA is different from having a high SLA or doing a better job than the guys in the cloud. It means that you are able to control what the SLA is. If you use the cloud, you get the SLA that is given to you.
Once again, as I mentioned earlier, for both of them (SLA and features), control comes at the expense of economy of scale.
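To make the "divided by 1 (you)" arithmetic concrete, here is a minimal sketch. It assumes, purely for illustration, that a provider's total cost is fixed and split evenly across its customers; the function name and the figures are hypothetical and not taken from any real provider.

```python
def cost_per_customer(total_cost: float, customers: int) -> float:
    """Split a total cost evenly across the customers sharing it."""
    if customers < 1:
        raise ValueError("customers must be at least 1")
    return total_cost / customers


# Hypothetical figure, chosen only to make the arithmetic visible.
TOTAL_COST = 1_000_000.0

# 'do it yourself': the scale is 1, so you bear the full cost.
do_it_yourself = cost_per_customer(TOTAL_COST, 1)

# 'as a service': the same cost shared by 10,000 customers.
as_a_service = cost_per_customer(TOTAL_COST, 10_000)

print(do_it_yourself, as_a_service)
```

The same division applies on both dimensions: to the cost of building features and to the cost of running the software at a given SLA.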
map of possibilities
Why is this important? Because these 2 dimensions create a “map” of possibilities that enterprises can use for their IT assets.
Enterprises are now capable of deciding, along these 2 dimensions, where they want control of features and/or control of SLA at the expense of economy of scale. No area on the map is a “better choice” than another; it is about making sure that the various IT assets are placed where they should be, based on relevance to the business, compliance with regulation, etc.
The table below gives some examples of IT asset types, based on the level of control along both features and SLA. In the top left corner, you find the classic 'packaged software deployed on premise'. By running it yourself, you have full control of SLA, but because it is packaged software you have low control of features. The low control of features is compensated by high economy of scale on features: the software vendor, amortizing the R&D cost across hundreds or thousands of customers, can build features more cheaply than you can yourself. The bottom left area is where we find the good old "build and run on premise" software, for example a homegrown banking system. There, you have full control of SLA and features since you are doing everything yourself, but you have no economy of scale. Both the cost of developing the features and the cost of running the software can only be divided by 1 (you). The top right area is the canonical 'SaaS' offering. The economy of scale is high on both the features and the SLA, but you have little control over either. The intermediate columns (@hoster and @cloud) are deployment options with decreasing SLA control compared to doing it yourself, but increasing economy of scale.
(note: one could justifiably argue about whether the economy of scale is higher @cloud or @vendor; the rationale to place them in this order is that @cloud gives you more control than @vendor; the assumption here is that you would be deploying your own software or packaged software in a cloud compute environment and therefore have some level of control on how much computing power you want to allocate to your applications, as opposed to @vendor where you have 0 control)
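One way to picture the map is as two small enumerations, one per dimension. The sketch below is only an illustration of the text: the class and method names (WhoBuilds, WhereRuns, Placement, sla_control_rank) are hypothetical, and the column order follows the ordering argued for in the note above.

```python
from dataclasses import dataclass
from enum import IntEnum


class WhoBuilds(IntEnum):
    """Feature dimension: a higher value means less feature control."""
    BUILD = 0  # built in house
    BUY = 1    # packaged software or a provider's offering


class WhereRuns(IntEnum):
    """SLA dimension: a higher value means less SLA control."""
    ON_PREMISE = 0
    HOSTER = 1
    CLOUD = 2
    VENDOR = 3


@dataclass(frozen=True)
class Placement:
    who_builds: WhoBuilds
    where_runs: WhereRuns

    def feature_control(self) -> str:
        return "high" if self.who_builds is WhoBuilds.BUILD else "low"

    def sla_control_rank(self) -> int:
        # 3 = full control (on premise) down to 0 = no control (@vendor)
        return len(WhereRuns) - 1 - int(self.where_runs)


# The three corners discussed in the text.
packaged_on_premise = Placement(WhoBuilds.BUY, WhereRuns.ON_PREMISE)
homegrown = Placement(WhoBuilds.BUILD, WhereRuns.ON_PREMISE)
saas = Placement(WhoBuilds.BUY, WhereRuns.VENDOR)

for name, p in [("packaged", packaged_on_premise),
                ("homegrown", homegrown),
                ("saas", saas)]:
    print(name, p.feature_control(), p.sla_control_rank())
```

The ranks only encode ordering, not magnitude: nothing here claims that @cloud is "twice as controllable" as @vendor.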
Now that we have discussed some of the theory behind this, let’s go through a semi-hypothetical and greatly simplified scenario. (I say semi-hypothetical because this scenario, without being 100% real, is heavily inspired by an actual conversation I had with a large pharmaceutical company.)
In this scenario, there are a couple of IT assets they built themselves, as they wanted very unique features, and some other assets they sourced from the market, as they “just” wanted what everybody else had. In other words, they made significant investments in assets where they wanted competitive differentiation (e.g. clinical trial management software and new molecule research) and purchased from the market the 'common in the industry', non-differentiating assets (CRM, Email,...). In addition to these choices, they ran all of their IT themselves, owning and therefore having control over the SLA of their entire IT environment.
Although this picture is quite common, my discussion with this company's CIO surfaced that this map did not represent how they wanted to run their IT. They knew they were spending too much of their budget on non-differentiating assets, limiting the amount of investment they could make in differentiating assets.
In other words, the way they would like to run their IT is better reflected by the picture below.
Since Email and CRM are not seen as competitive differentiators, it is OK to trade control of SLA and features for much higher economy of scale (shift to the right). The legacy HR system, built in house for historical reasons, should be pushed up to gain economy of scale in terms of features, but would be kept in house to keep control of the SLA. Clinical trial software, being an asset that provides competitive advantage, gets a doubling down of investment (thanks to the savings from pushing some assets to the right). The new molecule research software is pushed to the cloud to get access to elastic computing resources (variable peak computation) as well as cheaper storage, at the expense of control of SLA; although it is running 'off premise', its development is kept in house to keep full control of features.
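The before and after maps of this scenario can be written down as two small tables and compared. This is a rough sketch under stated assumptions: the asset keys and the moves() helper are hypothetical, and "shift to the right" for Email and CRM is assumed here to mean the rightmost (@vendor) column, which the text does not state explicitly.

```python
# Each asset maps to (who builds it, where it runs).
current = {
    "email": ("buy", "on_premise"),
    "crm": ("buy", "on_premise"),
    "hr": ("build", "on_premise"),
    "clinical_trials": ("build", "on_premise"),
    "molecule_research": ("build", "on_premise"),
}

target = {
    "email": ("buy", "vendor"),               # assumption: rightmost column
    "crm": ("buy", "vendor"),                 # assumption: rightmost column
    "hr": ("buy", "on_premise"),              # pushed up, kept in house
    "clinical_trials": ("build", "on_premise"),  # stays put, gets more investment
    "molecule_research": ("build", "cloud"),  # built in house, run in the cloud
}


def moves(before: dict, after: dict) -> dict:
    """List, per asset, which dimension changes between the two maps."""
    result = {}
    for asset, (build_now, run_now) in before.items():
        build_next, run_next = after[asset]
        changes = []
        if build_now != build_next:
            changes.append(f"features: {build_now} -> {build_next}")
        if run_now != run_next:
            changes.append(f"runs: {run_now} -> {run_next}")
        result[asset] = changes
    return result


for asset, changes in moves(current, target).items():
    print(asset, changes or "no move")
```

An empty list of changes does not mean nothing happens to an asset: clinical trial software stays in the same cell but receives the freed-up investment.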
As you can see, even in this highly simplified environment, the goal is to clearly understand where keeping control makes sense and where it is better to tap into economy of scale and place the assets accordingly.
crossing the chasm
Unfortunately, this is easier said than done.
To reuse a phrase made popular by Geoffrey A. Moore in his book (albeit in a completely different context), pushing software out to the cloud (e.g. CRM in the example above) as well as projecting cloud software back into the corporate boundary (e.g. the new molecule research software) is very much like crossing a chasm. And it is precisely that chasm crossing that architects must master.
The architectural challenges are multiple; the major ones can be categorized in 3 buckets: identity, management and data. Examples of identity challenges are cross-boundary authentication and authorization, single sign-on and identity lifecycle. Examples of management challenges are cross-firewall SLA monitoring and triggering management actions on cloud software (halting, pausing, throttling). Examples of data challenges are data ownership, portability, reporting and privacy. As you can see, a lot of good stuff for architects to become even more indispensable :)
To be honest, I do not have all the answers yet. But now that a clear scenario has hopefully been described, and the cloud impact of this scenario is better understood, I hope you will join us on our new journey of discovering and describing the underlying black magic required to master the cloud.
In future posts we will go through these 3 buckets in more detail. We will discuss the high level architecture(s) that this semi-fictitious "Big Pharma" company could put in place to smoothly cross the chasm, and describe the set of 'on premise' and 'cloud' technologies that can be leveraged to do all that. And of course, similarly to what we did with LitwareHR, I would not be surprised if we threw a few bits and a reference model into the mix as well :) Finally, in addition to the enterprise angle, we will be exploring the complementary view, the ISV perspective.
Although it was not the initial intent, now that I have written all this, I find that this post has an eerie similarity to Fred's invitation back in February 2006, when we started our SaaS architecture work and invited everybody to walk the journey with us. Hopefully this ride will be as fun as the previous one.