The advances in processors, memory, storage, and connectivity have paved the way for next-generation applications that are data-driven, whose data could reside anywhere (i.e. on the desktops, mobile devices, servers, and in the cloud) and that require access from anywhere (i.e. local, remote, over the network, from mobile devices, in connected and disconnected mode). This trend has led to the development of distributed, multi-tiered, and composite application architectures for the web and for the enterprise. A typical enterprise application accesses data from multiple data sources, integrates that data, re-shapes (or transforms) that data into a form most suitable for the application (typically into object form like C# or Java object), and writes application logic. The same is true of web applications – consider social networking apps or mashups – they access data from multiple web sources, over the internet, aggregate it, execute application logic, and generate pages for web interaction. As these styles of multi-tiered web and enterprise application are becoming main stream, the demand for application performance and scale is increasing. End users become less tolerant and more frustrated when a web application cannot respond in milliseconds; web applications that cannot scale, as the number of concurrent accesses increase, lose traffic and thereby business. Fundamentally, we have all begun to expect high performance and scale from every application. And let’s not forget application availability. For similar reasons to those I describe above, an application cannot be down. We cannot imagine the MSN portal or the Amazon web site, or the corporate SAP financial application being down when we need it. We expect to access our personal information on MSN at any time; consumers do business with Amazon at any time and from anywhere. Fundamentally, applications need to be available all the time to support access at any time, and from anywhere. Another major expectation, especially from application developers and from application hosters is that of scalable and available applications at a low cost. A decade ago, only mission-critical businesses could afford to invest in large and expensive infrastructure (both hardware and software) to support scale and availability of their applications. But, now with web hosting, everyone expects and demands high scale and availability at low cost. Extending this even further, not only developers want cheap scalable and available applications, they want the ability to develop (and deploy) such applications very rapidly.

To cope with competitive pressure, both from an innovation and a deployment perspective, rapid development and deployment of these applications is critical for application vendors.  In turn, application developers are looking for application infrastructure that enables them to build highly performant, scalable, and available applications using commodity hardware and software, at a rapid pace. Traditional application platforms like the .NET and Java platforms, which are known for rapid multi-tier application development and deployment, are required to provide the scalability and availability infrastructure. 

Distributed cache is becoming the key application platform component for providing scalability and high availability. In-memory caching has been traditionally used primarily for meeting the high performance requirements of applications. By fusing caches on multiple nodes into a single unified cache however, the distributed caches offer not only high performance, but also scale. By maintaining copies of data on multiple cache nodes (in a mutually consistent manner), the distributed cache can also offer high availability to applications. Distributed caches are especially ideal for applications with the following characteristics:

  • There is a considerable number of data requests that are mostly read (e.g. product catalogs)
    • Large concurrent access to such data can be provided by replicating the catalog data on multiple cache nodes. Since updates are infrequent to such data, maintaining consistency (synchronously or asynchronously) is not very expensive
  • Applications that can tolerate some staleness of data
    • Such applications can provide better performance and scale by not requiring immediate updates ore refreshing of caches
  • Applications that can work with highly partitioned data (e.g. session data, shopping cart)
    • High scale and performance can be supported by partitioning and distributing data across multiple cache nodes, and thereby distributing data processing across the cache nodes
  • Applications that can work well with eventual consistency
    • Consider a flight inventory application, which must satisfy a large number of concurrent read/writes to the inventory of seats. To support large scale, the distributed cache may replicate the inventory value on multiple nodes; however, the inventory values on different nodes have to be made consistent in some fashion.  Requiring immediate (also known as strong) consistency will require updates to be synchronously propagated to all the copies. Such action would impact the overall performance and scale of the application. However, instead of immediately making the copies consistent, allowing them to eventually (in an asynchronous manner) become consistent will provide low latency, high performance access to inventory.

As distributed caches become more widely deployed, I believe over the next few years, distributed cache will be used as the first tier of all data access. Multi-tier application architecture will include the cache tier as a data access tier between the application server tier and the backend data tier.

Today, Microsoft is announcing the first CTP of a distributed caching product to provide the .NET application platform support for developing highly performant, scalable, and highly available applications. The project code named “Velocity” is a distributed cache that allows any type of data (CLR object, XML document, or binary data) to be cached. “Velocity” fuses large numbers of cache nodes in a cluster into a single unified cache and provides transparent access to cache items from any client connected to the cluster. The Data Platform Developer Center at http://msdn.microsoft.com/data and the Velocity Team Blog at http://blogs.msdn.com/velocity provides additional information about project code named “Velocity” as well as links to download our first CTP.

Distributed caches are not new – during the last couple of years several caching products have emerged to address the performance and scalability needs of applications. Most of these products are point products, primarily supporting key-based access. Other than memcached, which is an open source technology, most others target enterprises and enterprise workloads and scale. I think the web workloads require considerably large scale, with 1000s of cache nodes in a cluster. The web scale distributed caches not only require mechanisms that can scale and provide availability in very large clusters, they must be easy to manage or self-managed. In the Future, “Velocity” envisions being an integral part of the .NET application stack targeting both enterprise and web workloads (and scale). As applications start using the caches for data access, I also believe, they will demand richer data services like query, transactions, analytics, synchronization etc. For example, we believe .NET applications will require LINQ queries on the distributed cache, the same way they query the backend SQL Server database. We envision “Velocity” becoming such a comprehensive distributed caching platform. The performance, scale, and availability functionality of “Velocity” along with its rich data services will allow for rich web and enterprise applications development and deployment.

Anil Nori
Microsoft Distinguished Engineer